Retinal predictions using predicted future optical coherence tomography (OCT) images

The system generates synthetic predicted OCT imaging data using machine learning models to forecast retinal changes, addressing the challenge of predicting disease progression and treatment response in DME and nAMD, thereby enhancing disease management.

WO2026122958A1 · PCT designated stage · Publication Date: 2026-06-11 · GENENTECH INC

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
GENENTECH INC
Filing Date
2025-12-05
Publication Date
2026-06-11


Abstract

A method and system for generating synthetic predicted OCT imaging data are provided. Input data including characteristic data and baseline imaging data associated with a subject at a baseline timepoint may be received. Input data associated with the subject for a set of future timepoints may be received. First and second machine learning models may be used to generate predicted OCT imaging data associated with the retina of the subject for the set of future timepoints, based on the input data.

Description

Attorney Docket: 59868.78WO01

RETINAL PREDICTIONS USING PREDICTED FUTURE OPTICAL COHERENCE TOMOGRAPHY (OCT) IMAGES

Inventors: Yusuke Alexander Kikuchi and Qi Yang

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is related to and claims the benefit of the priority date of U.S. Provisional Patent Application No. 63/728,668 filed December 5, 2024, and entitled "RETINAL PREDICTIONS USING PREDICTED FUTURE OPTICAL COHERENCE TOMOGRAPHY (OCT) IMAGES," and U.S. Provisional Patent Application No. 63/799,225 filed May 2, 2025, and entitled "RETINAL PREDICTIONS USING PREDICTED FUTURE OPTICAL COHERENCE TOMOGRAPHY (OCT) IMAGES," both of which are incorporated herein by reference in their entirety.

FIELD

[0002] This application relates to the generation of synthetic predicted optical coherence tomography (OCT) imaging data, and more particularly, to prediction of treatment response or disease progression using automated generation of synthetic predicted OCT imaging data.

BACKGROUND

[0003] Retinal diseases, such as diabetic macular edema (DME) and neovascular age-related macular degeneration (nAMD), are leading causes of vision loss in subjects 50 years and older. Some subjects with DME or nAMD may develop distorted vision or sustain retinal damage due to retinal response to disease progression, and may or may not respond to different treatments with varying degrees of success. Accordingly, predicting the response or the progression of retinal features to disease progression and/or treatment administration in individual patients may be important to developing effective treatments for DME and nAMD via clinical trials and providing treatment to individual patients.

SUMMARY

[0004] In one or more embodiments, a method and system for generating synthetic predicted OCT imaging data are provided. Characteristic data associated with a subject at a baseline timepoint may be received. A first machine learning model may be used to generate baseline optical coherence tomography (OCT) imaging data associated with a retina of the subject based on the characteristic data. Input data associated with the subject for a set of future timepoints may be received. A second machine learning model may be used to generate predicted OCT imaging data associated with the retina of the subject for the set of future timepoints, based on the input data and the baseline OCT imaging data.

BRIEF DESCRIPTION OF DRAWINGS

[0005] For a more complete understanding of the principles disclosed herein, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:

[0006] Figure 1 illustrates a block diagram of an image generation system in accordance with various embodiments.

[0007] Figure 2 illustrates a flowchart of an example preprocessing process in accordance with various embodiments of the present disclosure.

[0008] Figure 3 illustrates a diagram of example output images of a diffusion model in accordance with various embodiments of the present disclosure.

[0009] Figure 4 illustrates a block diagram of a latent diffusion model in accordance with various embodiments of the present disclosure.

[0010] Figure 5 illustrates a diagram of a pairing process in accordance with one or more embodiments of the present disclosure.

[0011] Figure 6 illustrates a diagram of an example training process of a diffusion model and ControlNet in accordance with one or more embodiments of the present disclosure.

[0012] Figure 7 illustrates a block diagram of an example autoencoder in accordance with one or more embodiments of the present disclosure.

[0013] Figure 8 illustrates an example diffusion model in accordance with one or more embodiments of the present disclosure.

[0014] Figures 9A-9D show example predictions in accordance with one or more embodiments of the present disclosure.

[0015] Figure 10 illustrates a flowchart of a process for generating synthetic predicted OCT imaging data in accordance with various embodiments of the present disclosure.

[0016] Figure 11 illustrates a block diagram of a computer system in accordance with various embodiments of the present disclosure.

[0017] It is to be understood that the figures are not necessarily drawn to scale, nor are the objects in the figures necessarily drawn to scale in relationship to one another. The figures are depictions that are intended to bring clarity and understanding to various embodiments of apparatuses, systems, and methods disclosed herein. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts. Moreover, it should be appreciated that the drawings are not intended to limit the scope of the present teachings in any way.

DETAILED DESCRIPTION

I. Overview

[0018] The embodiments described herein recognize that predicting changes in certain anatomical and functional features (e.g., fluid volume, layer thickness, etc.) may be important for managing retinal diseases such as, for example, diabetic macular edema (DME) and neovascular age-related macular degeneration (nAMD). For example, being able to accurately and reliably predict changes in anatomical and functional features of a subject in response to disease progression and/or administration of a treatment may be helpful in managing the treatment of DME or nAMD. For example, having an automated system and method for synthetic generation of optical coherence tomography (OCT) imaging data to predict changes in anatomical features may allow for generation of a personalized treatment regimen for a subject with retinal disease, for mitigating retinal damage, or for understanding a subject's retinal disease pathogenesis.

[0019] Generation of synthetic future OCT imaging may be used to predict a future status (e.g., anatomical and/or functional features) of a retina affected by retinal diseases such as nAMD and DME. For instance, generation of synthetic future OCT imaging may be used to predict the response of a subject's functional features (e.g., visual acuity) and/or anatomical features (e.g., layer thickness) to progression of a particular ophthalmological condition (e.g., nAMD) and/or to a particular treatment.

[0020] OCT is an imaging technique in which light is directed at a biological sample (e.g., biological tissue) and the light that is reflected from features of that biological sample is collected to capture two-dimensional or three-dimensional, high-resolution cross-sectional images of the biological sample.

[0021] In several embodiments, a diffusion model, such as a latent diffusion model (LDM), may be used to generate synthetic image data, such as baseline OCT image data associated with an ophthalmological condition of a retina of a subject at a baseline timepoint (e.g., before treatments). Additional conditions may be implemented to control the diffusion model to further generate a predicted OCT image based on the baseline OCT image and inputted data.

[0022] While methods of directly predicting future values of various retinal biomarkers exist, it may be difficult to accurately predict the future values for certain biomarkers (e.g., retinal layer thickness and/or fluid volume in the retina). A more interpretable approach that may ease this difficulty is to generate synthetic predicted imaging data that includes, in their predicted form, the anatomical features from which biomarker data is derived, such that the synthetically generated predicted imaging data itself may be analyzed for biomarker data at a future point in time.

[0023] Thus, the embodiments described herein recognize that it may be desirable to have systems and methods for automating the generation of synthetic predicted OCT imaging data. For example, it may be desirable to have systems and methods of accurately and reliably generating synthetic predicted OCT imaging data in order to predict anatomical and/or functional response to disease progression and/or treatment administration.

[0024] Recognizing and taking into account the importance and utility of a methodology and system that can provide the improvements described above, the specification describes various embodiments for automated generation of synthetic future OCT images. More particularly, the specification describes various embodiments of methods and systems for accurately and reliably generating synthetic predicted OCT imaging data, using a machine learning system (e.g., a deep learning system, which may be a neural network system), to predict responses of anatomical and/or functional features to disease progression and/or treatment.

II. Example System for Generation of Synthetic OCT Imaging Data

[0025] Figure 1 is a block diagram of an image generation system 101 in accordance with various embodiments. Image generation system 101 is used to generate synthetic OCT imaging data for retinas of subjects. In one or more embodiments, image generation system 101 includes computing platform 102, data storage 104, and display system 106. Computing platform 102 may take various forms. In one or more embodiments, computing platform 102 includes a single computer (or computer system) or multiple computers in communication with each other. In other examples, computing platform 102 takes the form of a cloud computing platform, a mobile computing platform (e.g., a smartphone, a tablet, etc.), or a combination thereof.

[0026] Data storage 104 and display system 106 are each in communication with computing platform 102. In some examples, data storage 104, display system 106, or both may be considered part of or otherwise integrated with computing platform 102. Thus, in some examples, computing platform 102, data storage 104, and display system 106 may be separate components in communication with each other, but in other examples, some combination of these components may be integrated together.

[0027] The image generation system 101 may further include data system 108. Data system 108 may be configured to generate or store data regarding a subject. For example, in one or more embodiments, data system 108 may generate or store characteristic data (e.g., characteristic data 110) associated with a subject. Characteristic data may include, for example, and without limitation, a subject's age, sex, race, ethnicity, visit history, diagnosis of an ophthalmological condition (e.g., a retinal disease), a treatment history and/or a current treatment, or a combination thereof. In some embodiments, characteristic data may be associated with the retina of a subject. Characteristic data may include extracted information associated with the retina of the subject. For example, characteristic data may be identified and/or determined based on an image of a retina of a subject at, for example, a baseline timepoint. In some embodiments, characteristic data may include measurements in OCT imaging data of the retina of a subject at a baseline timepoint. In some embodiments, the retina is a healthy retina. In other embodiments, the retina is one that has been diagnosed with or is suspected of having a retinal disease. For example, the diagnosis may be one of neovascular age-related macular degeneration (nAMD), diabetic macular edema (DME), or some other type of retinal disease. In various embodiments, characteristic data may include textual data (e.g., non-image data) received by, for example and without limitation, a user input, as discussed further below. In other embodiments, characteristic data may include information (e.g., non-image data) extracted via image analysis of a received image of a retina of a subject (e.g., measurements from an image of a retina of a subject).

[0028] In one or more embodiments, the data system 108 includes a system or machine that is configured to generate characteristic data 110 for the tissue of a subject. For example, data system 108 may be used to generate characteristic data 110 for the retina of a subject, or for a subject diagnosed with an ophthalmological condition. In some instances, data system 108 can be a large tabletop configuration used in clinical settings, a portable or handheld dedicated system, or a "smart" system incorporated into user personal devices such as smartphones.

[0029] In various embodiments, the image generation system 101 may be configured to receive characteristic data 110 associated with a subject at a baseline timepoint. In some embodiments, characteristic data 110 may be received by system 101 from a user input. For example, characteristic data 110 may be received by a user input using a user interface of system 101 or may be received through a transmission from a remote but communicatively connected device (e.g., a remote user device or data system 108).

[0030] Image generation system 101 may be in communication with data system 108 via network 112. Network 112 may be implemented using a single network or multiple networks in combination. Network 112 may be implemented using any number of wired communications links, wireless communications links, optical communications links, or combination thereof. For example, in various embodiments, network 112 may include the Internet or one or more intranets, landline networks, wireless networks, and/or other appropriate types of networks. In another example, the network 112 may comprise a wireless telecommunications network (e.g., cellular phone network) adapted to communicate with other communication networks, such as the Internet. In some cases, network 112 includes at least one of a local area network (LAN), a virtual local area network (VLAN), a wide area network (WAN), a public land mobile network (PLMN), the Internet, or another type of network. The data system 108 and image generation system 101 may each include one or more electronic processors, electronic memories, and other appropriate electronic components for executing instructions such as program code and/or data stored on one or more computer readable mediums to implement the various applications, data, and steps described herein. For example, such instructions may be stored in one or more computer readable media such as memories or data storage devices (e.g., data storage 104) internal and/or external to various components of image generation system 101, and/or accessible over network 112.

[0031] Although only one of each of data system 108 and the image generation system 101 is shown, there can be more than one of each in other embodiments. Further, although Figure 1 shows the data system 108 and the image generation system 101 as two separate components, in some embodiments, the data system 108 and the image generation system 101 may be parts of the same system (e.g., and maintained by the same entity such as a health care provider or clinical trial administrator). In some cases, a portion of image generation system 101 may be implemented as part of data system 108. For example, image generation system 101 may be configured to run as a module implemented using a processor, microprocessor, or some other hardware component of data system 108. In other embodiments, image generation system 101 may be implemented within a cloud computing system that can be accessed by or otherwise communicate with data system 108 using, for example, a network (e.g., network 112).

[0032] In various embodiments, image generation system 101 may process characteristic data 110 using machine learning (ML) models, such as first machine learning model 120 (also referred to herein as a "first model") and second machine learning model 126 (also referred to herein as a "second model"), to generate synthetic OCT imaging data. For instance, image generation system 101 may be configured to generate, using first model 120 and second model 126, predicted optical coherence tomography (OCT) imaging data associated with a retina of the subject, based on the characteristic data 110 and/or a baseline OCT image. In some embodiments, first model 120 may include a neural network, such as an artificial neural network (ANN) or a convolutional neural network (CNN). In other embodiments, first model 120 may be a diffusion model, such as a latent diffusion model (e.g., latent diffusion model 122).

[0033] In one or more embodiments, first model 120 includes a deep learning model, which may be implemented using one or more neural network systems. For example, first model 120 may include a diffusion model, such as a latent diffusion model 122 (also referred to herein as "LDM"). In one or more embodiments, LDM 122 includes a generative model configured to generate new data (e.g., imaging data) based on input data, such as input characteristics. In one or more embodiments, LDM 122 may use a variational autoencoder (VAE) to compress, using an encoder of the VAE, an image from a pixel space to a latent space (e.g., the VAE compresses the image to a lower dimensional representation in the latent space). The LDM may then use forward and/or reverse diffusion processes (data/noise transformations) in the latent space to facilitate efficient processing of data (e.g., generate a random tensor in latent space to allow for fast processing speeds by system 101). Subsequently, the VAE may be used to restore, using a decoder of the VAE, the image from the latent space to the pixel space, thus providing an outputted image. In various embodiments, generating an output (e.g., denoising an image or one or more video frames) may be based on text input, image input, and/or the like. For example, characteristic data 110 may include text describing one or more aspects of a subject and may be inputted into LDM 122 to generate OCT imaging data (e.g., an OCT image), which is outputted based on text conditioning (e.g., text-to-image). In some embodiments, LDM 122 may be based on a U-Net architecture.
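
The forward (data-to-noise) transformation in latent space described above can be illustrated with a minimal Python sketch. The specification does not state a noise schedule, so the linear schedule, the step count, and every function name below are assumptions made for illustration; the four-element list merely stands in for a VAE-encoded B-scan.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule; the source does not specify one."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_t), commonly written alpha-bar_t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def add_noise(latent, alpha_bar_t, rng):
    """Forward step: z_t = sqrt(a) * z_0 + sqrt(1 - a) * eps, eps ~ N(0, 1)."""
    noise = [rng.gauss(0.0, 1.0) for _ in latent]
    noisy = [math.sqrt(alpha_bar_t) * z + math.sqrt(1.0 - alpha_bar_t) * e
             for z, e in zip(latent, noise)]
    return noisy, noise


def recover_clean(noisy, noise, alpha_bar_t):
    """Invert the forward step given the exact noise a denoiser would estimate."""
    return [(z - math.sqrt(1.0 - alpha_bar_t) * e) / math.sqrt(alpha_bar_t)
            for z, e in zip(noisy, noise)]


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
latent = [0.5, -1.0, 0.25, 2.0]  # hypothetical latent code of one B-scan
noisy, noise = add_noise(latent, alpha_bars[500], rng)
restored = recover_clean(noisy, noise, alpha_bars[500])
```

In a trained model the noise would be estimated by the denoising network rather than known exactly; the sketch only shows why an accurate noise estimate is enough to recover the clean latent that the VAE decoder then maps back to pixel space.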

[0034] In one or more embodiments, first model 120 is trained using a first training dataset 124. In various embodiments, first training dataset 124 may include example characteristic data inputs and accompanying OCT images. In various embodiments, the correlated example real OCT imaging outputs may be preprocessed prior to their use in first training dataset 124. For example, preprocessing the example real OCT imaging data outputs may include performing a set of preprocessing operations. The set of preprocessing operations may include, for example without limitation, at least one of masking a vitreous humor of the eye based on inner limiting membrane (ILM) detection, flattening a retinal layer, moving a height of the flattened layer to a fixed height, cropping a width and/or height of the B-scan, normalizing the B-scan, executing a pixel intensity normalization operation, a scaling operation, a resizing operation, a horizontal flipping operation, a vertical flipping operation, a cropping operation, a rotation operation, a noise filtering operation, or the like, any combination thereof, or some other type of preprocessing operation. In one or more embodiments, training dataset 124 may be received from user input (e.g., a user inputting example inputs and outputs using a user interface of system 101). In other embodiments, training dataset 124 may be retrieved from a memory component or database (e.g., data storage 104) communicatively connected to computing platform 102. For example, training dataset 124 may include clinical data stored in data storage 104. In other embodiments, training dataset 124 may include historical inputs and outputs. In some embodiments, training dataset 124 may include compiled data (e.g., clinical data) and/or historical training data, where historical training data includes previously received inputs and corresponding determined outputs (e.g., historical inputs and outputs that have been fed back into system 101). In some embodiments, the neural network may iteratively be updated using previously used inputs and determined outputs.
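
Three of the listed preprocessing operations (vitreous masking based on the ILM, flattening to a fixed height, and intensity normalization) might be sketched as follows. This is a toy illustration on nested lists, not the disclosed implementation: it assumes the ILM row for each column is already known from a separate segmentation step, it flattens along the ILM itself (the specification does not say which layer is flattened), and all function names are hypothetical.

```python
def mask_vitreous(bscan, ilm_rows):
    """Zero out pixels above the ILM row in each column (A-scan)."""
    height, width = len(bscan), len(bscan[0])
    return [[0.0 if r < ilm_rows[c] else bscan[r][c] for c in range(width)]
            for r in range(height)]


def flatten_to_height(bscan, layer_rows, target_row):
    """Shift each column so the layer lands on target_row; pad with zeros."""
    height, width = len(bscan), len(bscan[0])
    out = [[0.0] * width for _ in range(height)]
    for c in range(width):
        shift = target_row - layer_rows[c]
        for r in range(height):
            src = r - shift
            if 0 <= src < height:
                out[r][c] = bscan[src][c]
    return out


def normalize(bscan):
    """Min-max normalize pixel intensities to the range [0, 1]."""
    flat = [v for row in bscan for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant image
    return [[(v - lo) / span for v in row] for row in bscan]


# Toy 5x3 B-scan whose intensity grows with depth; ILM rows are assumed known.
raw = [[10.0 * (r + 1) + c for c in range(3)] for r in range(5)]
ilm_rows = [1, 2, 1]
masked = mask_vitreous(raw, ilm_rows)
prepared = normalize(flatten_to_height(masked, ilm_rows, target_row=2))
```

After these steps every column has its ILM on row 2, everything above it is zero, and intensities are on a common scale, which is the kind of alignment that makes B-scans from different visits directly comparable.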

[0035] In some embodiments, the preprocessing operations may require segmentation of the correlated example real OCT imaging outputs in order to detect retinal layers (e.g., the ILM). For example, the correlated example real OCT imaging outputs may be segmented to detect retinal elements according to one or more techniques as described in International Publication No. WO2023205511A1, which is incorporated by reference herein in its entirety. A retinal element may be comprised of at least one of a retinal layer element or a retinal pathological element. Detection and identification of one or more retinal layer elements may be referred to as layer element (or retinal layer element) segmentation. Detection and identification of one or more retinal pathological elements may be referred to as pathological element (or retinal pathological element) segmentation.

[0036] Baseline OCT imaging data may include information associated with a state of a retina at a baseline timepoint (e.g., a current and/or present timepoint). For example, baseline OCT imaging data 125 may include information (e.g., a baseline OCT image) related to a current state of a disorder of the eye (e.g., a retinal disorder, such as an ophthalmological condition of a retina). A current state of a retinal disorder (e.g., nAMD) may include a current progression of the disorder or current anatomical and/or functional response to a past treatment for the disorder. In various embodiments, baseline OCT imaging data 125 may include one or more synthetic images (e.g., generated images) or real images (e.g., captured images) and/or videos, such as, for example, synthetic OCT volumes, OCT B-scans, and/or OCT "slice" images.

[0037] In one or more embodiments, image generation system 101 may be configured to receive input data associated with the subject. Input data 132 may include characteristic data 110, baseline OCT imaging data 125 (e.g., a real baseline OCT image), or the like, as previously discussed herein. In some embodiments, input data 132 may include for example, and without limitation, a set of future timepoints, where the future timepoints are further in time relative to the baseline timepoint. For example, input data 132 may include a future timepoint of thirty days, two hundred days, three months, one year, and the like, from the baseline timepoint of the baseline OCT imaging data. In other embodiments, input data 132 may include a treatment schedule. A treatment schedule may include a type of treatment to be administered to the subject, a dosage amount of the treatment, a frequency of the administration of the treatment, and the like. In some embodiments, there may be no treatment schedule so that the natural, future progression of the ophthalmological condition without administration of a treatment may be determined, as discussed further below.
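
One way to picture input data 132 is as a small record that combines characteristic data, the requested future timepoints, and an optional treatment schedule. The sketch below is only an illustration: the class names, field names, and units are hypothetical, and the dose-counting helper assumes the first administration occurs at the baseline timepoint, which the specification does not state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TreatmentSchedule:
    treatment_type: str   # type of treatment to be administered
    dose_mg: float        # dosage amount (unit is an assumption)
    interval_days: int    # frequency of administration


@dataclass
class PredictionInput:
    age: int
    sex: str
    diagnosis: str
    future_days: List[int]                        # days after the baseline timepoint
    schedule: Optional[TreatmentSchedule] = None  # None models natural progression

    def __post_init__(self):
        if not self.future_days or any(d <= 0 for d in self.future_days):
            raise ValueError("future timepoints must be later than baseline")
        self.future_days = sorted(set(self.future_days))

    def doses_before(self, day):
        """Administrations on or before `day`, assuming the first is at day 0."""
        if self.schedule is None:
            return 0
        return day // self.schedule.interval_days + 1


treated = PredictionInput(
    age=67, sex="F", diagnosis="nAMD", future_days=[365, 30, 90, 30],
    schedule=TreatmentSchedule("treatment A", dose_mg=2.0, interval_days=28),
)
untreated = PredictionInput(age=67, sex="F", diagnosis="nAMD", future_days=[200])
```

Leaving `schedule` unset corresponds to the case in which no treatment schedule is supplied and the natural course of the condition is predicted.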

[0038] In one or more embodiments, image generation system 101 may receive input data (e.g., input data 132) associated with the retina of the subject for a set of future timepoints. In one or more embodiments, image generation system 101 may receive input data 132 from data system 108. For example, input data 132 may include a selected future timepoint of a set of future timepoints, and/or a treatment schedule associated with the retina of the subject, taking place over the set of future timepoints. In various embodiments, the treatment schedule may be associated with an ophthalmologic condition that the retina of the subject has been diagnosed with or is at risk of developing. For example, the ophthalmologic condition may be a retinal disease such as DME or nAMD, as previously mentioned above.

[0039] In one or more embodiments, image generation system 101 uses the baseline OCT imaging data (e.g., baseline OCT imaging data 125) to generate predicted OCT imaging data (e.g., predicted OCT imaging data 136), using a second machine learning model, such as second machine learning model 126. For example, first model 120 and second model 126 may output synthetic predicted OCT imaging data (e.g., predicted OCT imaging data 136) associated with the retina of the subject for a set of future timepoints, based at least on baseline OCT imaging data (e.g., baseline OCT imaging data 125) and characteristic data 110. Second model 126 may include a different machine learning model from first machine learning model 120. For example, in some embodiments, second model 126 may include a trainable control branch ControlNet (CN) neural network (e.g., ControlNet 128) that provides additional conditions (e.g., spatial conditioning) to first model 120. It will be appreciated that though the use of OCT B-scans is described herein, OCT volumes may instead be used for one or more of the processes described herein without departing from the scope and spirit of the disclosure.

[0040] The ControlNet neural network may include a neural network and/or a trainable control branch configured to guide and/or control image generation by first model 120. For instance, a CN neural network may include a modular add-on that provides one or more conditions in addition to, for example, an inputted text (e.g., characteristic data, such as characteristic data 110). In some embodiments, ControlNet 128 may guide latent diffusion model 122 in generating predicted OCT imaging data 136 based on input data 132. In various embodiments, ControlNet 128 may include a conditioning architecture for a diffusion model, such as LDM 122. ControlNet 128 may be built on a backbone and/or convolutional layers of LDM 122. For instance, ControlNet 128 may mirror parts of the pretrained diffusion model (e.g., weights, inductive biases, or the like) and add additional conditions. In various embodiments, the trainable control branch is trained to condition the generation process.
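
The idea of a control branch that mirrors pretrained weights and adds a condition can be shown with a deliberately tiny Python sketch. The zero-initialized output scale below follows the published ControlNet design rather than anything stated in this specification, and the elementwise "block" is only a stand-in for a real convolutional layer; all names are hypothetical.

```python
def frozen_block(x, weights):
    """Stand-in for one pretrained, frozen LDM block: elementwise scale + ReLU."""
    return [max(0.0, w * v) for w, v in zip(weights, x)]


class ControlBranch:
    """Trainable copy of a frozen block with a zero-initialized output scale."""

    def __init__(self, pretrained_weights):
        self.weights = list(pretrained_weights)  # mirrors the pretrained weights
        self.zero_scale = 0.0                    # assumption: zero-initialized

    def __call__(self, x, condition):
        # The condition (e.g., an encoded baseline B-scan) is added to the input.
        hidden = frozen_block([a + b for a, b in zip(x, condition)], self.weights)
        return [self.zero_scale * h for h in hidden]


def controlled_block(x, condition, weights, branch):
    """Frozen output plus the control branch's residual contribution."""
    base = frozen_block(x, weights)
    residual = branch(x, condition)
    return [b + r for b, r in zip(base, residual)]


weights = [1.0, -0.5, 2.0]
x = [0.2, 0.4, -0.1]
condition = [1.0, 1.0, 1.0]
branch = ControlBranch(weights)

at_init = controlled_block(x, condition, weights, branch)
branch.zero_scale = 0.5  # pretend training has moved the scale away from zero
after_update = controlled_block(x, condition, weights, branch)
```

With the scale at zero the combined block reproduces the frozen model exactly, so adding the branch does not disturb the pretrained behavior; the condition starts to influence the output only as the scale is learned.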

[0041] In non-limiting embodiments, LDM 122 may operate on a representation space. LDM 122 may be trained to denoise real images with induced noise. During inference, LDM 122 may generate a synthetic image from input conditions (e.g., characteristic data 110) and noise. In various embodiments, the decoder may include a decoder from a pre-trained autoencoder. The autoencoder may include an encoder and a decoder, as further discussed herein. As discussed further with respect to Figure 4, inputs may include a B-scan and additional variables (e.g., age, sex, distance from central scan, target visit, last treatment before target visit, and so on). In some embodiments, ControlNet 128 may act on LDM 122 (e.g., guide image generation) so that the generated image (e.g., predicted OCT image) is based on the input OCT image (e.g., B-scan or OCT volume). In the training of ControlNet 128, trained LDM 122 may be frozen and ControlNet 128 may be trained on image pairs (e.g., the paired input-output B-scans, as discussed further below). In other embodiments, first model 120 (e.g., LDM 122) and second model 126 (e.g., ControlNet 128) may be trained simultaneously.

[0042] In one or more embodiments, at least second model 126 is trained using a second training dataset 134. In various embodiments, second training dataset 134 includes example input data, corresponding example baseline OCT imaging data inputs, and correlated example future OCT imaging data training targets (e.g., example outputs). In various embodiments, the corresponding example baseline OCT imaging data inputs and/or the correlated example future OCT imaging training targets may be preprocessed prior to their use in second training dataset 134. For example, preprocessing the example OCT imaging data outputs may include performing a set of preprocessing operations. The set of preprocessing operations may include, for example without limitation, at least one of masking a vitreous humor of the eye based on inner limiting membrane (ILM) detection, flattening a retinal layer, moving a height of the flattened layer to a fixed height, cropping a width and/or height of the B-scan, normalizing the B-scan, executing a pixel intensity normalization operation, a scaling operation, a resizing operation, a horizontal flipping operation, a vertical flipping operation, a cropping operation, a rotation operation, a noise filtering operation, or the like, any combination thereof, or some other type of preprocessing operation.

[0043] In some embodiments, second training dataset 134 includes example OCT image pairs, as previously mentioned. For example, and without limitation, second training dataset 134 may include an example baseline OCT image and a correlated example future OCT image. The example OCT image pairs may be provided by a user input and/or retrieved from a database (e.g., data storage 104). In various embodiments, OCT image pairs may be OCT images associated with clinical data. In one or more embodiments, sets of example OCT image pairs may be used, where each set of example OCT image pairs is associated with a type of treatment. In one or more embodiments, second training dataset 134 may include OCT image pairs of OCT images associated with natural disease progression. For example, second training dataset 134 may include OCT image pairs of OCT images associated with natural progression of retinal diseases, such as for example without limitation, DME or nAMD.

[0044] In some embodiments, and as discussed further in Figure 5, determining image pairs (e.g., image pairing) used for training data, such as second training dataset 134, may include forming an input-output pair for model training. B-scans of substantially similar locations relative to a center volume may be matched such that a B-scan from a prior time t₁ may be set as an example input and a B-scan taken at a different time from t₁, such as a subsequent time t₂, may be set as an example output.
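
The pairing step can be sketched as a simple matching over B-scan metadata. This is a minimal illustration under stated assumptions: the record fields, the integer offset used to express location relative to the central scan, and the `max_offset_diff` tolerance are all hypothetical, and the sketch pairs only an earlier scan with a later one even though the paragraph above allows other time orderings.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class BScan:
    eye_id: str
    visit_day: int
    offset_from_center: int  # B-scan index relative to the central scan


def make_pairs(scans, max_offset_diff=0):
    """Return (input, target) pairs: same eye, matching location, t1 < t2."""
    pairs = []
    ordered = sorted(scans, key=lambda s: (s.eye_id, s.visit_day))
    for earlier, later in combinations(ordered, 2):
        same_eye = earlier.eye_id == later.eye_id
        close = abs(earlier.offset_from_center
                    - later.offset_from_center) <= max_offset_diff
        if same_eye and close and earlier.visit_day < later.visit_day:
            pairs.append((earlier, later))
    return pairs


scans = [
    BScan("A", 0, 0), BScan("A", 30, 0), BScan("A", 90, 0),
    BScan("A", 30, 5),  # different location: no partner at offset 5
    BScan("B", 0, 0),   # different eye: only one visit, so no pair
]
pairs = make_pairs(scans)
```

Each visit can serve as the input for every later visit at the same location, so three visits at one location yield three training pairs.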

[0045] Second training dataset 134 may be used to train second model 126 so that second model can predict a future state of a retina of a subject based on at least a baseline state of the retina of the subject. More specifically, system 101 may be configured to generate, using second model 126, predicted optical coherence tomography (OCT) imaging data associated with the retina of the subject for the set of future timepoints, based on the input data and the baseline OCT imaging data.

[0046] In various embodiments, predicted OCT imaging data 136 may include synthetic future images, such as, for example, synthetic OCT volumes, OCT B-scans, and / or OCT "slice" images. For example, predicted OCT imaging data 136 may include predicted synthetic OCT imaging data at a future timepoint, including, for example without limitation, one or more predicted features associated with predicted responses of the retina based on, at least in part, a selected future timepoint, the treatment schedule, and / or the baseline OCT imaging data (e.g., baseline OCT imaging data 125).

[0047] In some embodiments, predicted OCT imaging data 136 may be used to predict values (e.g., quantitative measurements such as thickness, length, width, area, volume, and the like) for a set of biomarkers for the future set of timepoints. For instance, image generation system 101 may be configured to predict biomarker features based on extracted (e.g., determined and / or identified) biomarker values and / or measurements of predicted OCT imaging data 136 (e.g., predicted OCT image). In one or more embodiments, predicted OCT imaging data 136 may be used to predict values for a set of biomarkers for the future set of timepoints in response to a treatment schedule. In one or more embodiments, predicted OCT imaging data 136 may be used to predict values for a set of biomarkers for the future set of timepoints in response to natural progression of a disease (e.g., a retinal disease such as DME or nAMD). In various embodiments, the set of biomarkers may include a retinal layer element, a retinal pathological element, or a combination thereof. The values for the set of biomarkers may include, for example without limitation, values for layer thickness of a retinal layer element, or values for fluid volume of a retinal pathological element.

[0048] In one or more embodiments, image generation system 101 uses characteristic data (e.g., characteristic data 110) and baseline OCT imaging data (e.g., baseline OCT imaging data 125) to generate predicted OCT imaging data (e.g., predicted OCT imaging data 136), using first and second machine learning models, such as first machine learning model 120 and second machine learning model 126, both of which have been trained in a two-stage (e.g., dual-stage) training process as discussed above. In one or more embodiments, generation of the predicted OCT imaging data 136 (e.g., predicted OCT image) using the diffusion model may be done by first model 120 using guidance from second model 126. For example, generation of the predicted OCT image may involve the ControlNet (e.g., ControlNet 128) providing additional conditions for LDM 122 to generate the predicted OCT image based on input data, such as a baseline OCT image and characteristic data.

[0049] As previously mentioned, first and second models may each include a neural network. In an aspect, the neural network may be a convolutional neural network (CNN). In some embodiments, the neural network may be implemented by computing platform 102. First neural network (e.g., first model) may be used to process characteristic data and real baseline OCT imaging data. Second neural network (e.g., second model) may be used to guide the processing by the first model to generate synthetic predicted OCT imaging data. In some cases, the neural network(s) may detect / identify features (e.g., feature extraction) of inputted images (e.g., synthetic or real baseline OCT imaging data) to determine characteristic data, as previously discussed herein. In some cases, the neural network(s) may detect / identify features (e.g., feature extraction) of outputted images (e.g., synthetic or real predicted OCT imaging data). Imaging data may refer to numerical values that represent pixel values (e.g., pixel intensities) of an image, and thus imaging data may include raw data used to create the image (e.g., visual representation) shown on a display for a user to view, such as display system 106 of system 101. A feature or region of interest within an image may include information associated with features (e.g., anatomical features or biomarkers) of the retina, and may be used to predict values for a set of biomarkers for the future set of timepoints, as previously discussed herein. Such features may be shown on the display for a user to view. The images displayed may further include annotations, such as highlights, comments, numerical values or measurements, markings, any combination thereof, and so on.

[0050] In one or more embodiments, any of the neural networks described herein may include various nodes (e.g., neurons) arranged in multiple layers including an input layer receiving one or more inputs, hidden layers, and an output layer providing one or more outputs. The input(s) may collectively provide a training dataset for use in training the neural network. Although particular numbers of nodes and layers are discussed, any desired number of such features may be provided in various embodiments. The training dataset may include any training datasets discussed within this disclosure (e.g., first training dataset 124 and / or second training dataset 134).

[0051] In some embodiments, the neural networks may operate as a multi-layer classifier using a set of non-linear transformations between the various layers to extract features and / or information from images (e.g., quantitative measurements). In various embodiments, the neural network may be trained on large amounts of data and may be iteratively trained using updated training datasets (e.g., an updated first training dataset and / or updated second training dataset) until the neural network has trained on enough data such that the neural network can perform predictions, or more accurate predictions, of its own.

[0052] In some embodiments, updated training datasets may be obtained by analyzing historical inputs and outputs of the neural network that may be presented (e.g., displayed) to a user such that the user has an opportunity to review the data and provide user input to adjust the data as appropriate. The user input may be analyzed and fed back to update a training dataset used to train the neural network. In this regard, the user input may be provided in a backward pass through the neural network to update neural network parameters and / or conditions based on the user input. In some aspects, the backward pass may include back propagation and gradient descent. In some aspects, by adjusting the training dataset to improve accuracy, the user may avoid costly delays in implementing accurate feature classifications, synthetic imaging data generation, and so forth.

[0053] Figure 2 shows an example preprocessing process 200 in accordance with one or more embodiments of the present disclosure. In various embodiments, preprocessing may include image standardization. As previously discussed, the correlated example OCT imaging training targets / outputs may be preprocessed prior to their use in training data, such as first training dataset 124 of Figure 1. In the example illustrated in Figure 2, the process 200 involves using a raw B-scan image 202, to create images 204, 206, 208, and 230.

[0054] As shown in step 205, process 200 may include providing the raw B-scan image 202.

[0055] As shown in step 210, process 200 may include masking a vitreous humor of the eye based on inner limiting membrane (ILM) detection. For example, process 200 may include masking the vitreous humor of raw B-scan 202 to create image 204.

[0056] As shown in step 215, process 200 may include flattening a retinal layer to create image 206. In some embodiments, process 200 may include moving a height of the flattened layer to a fixed height.

[0057] As shown in step 220, process 200 may include cropping a width and / or height of the B-scan to create image 208.

[0058] As shown in step 225, process 200 may include executing a pixel intensity normalization operation to create image 230.

[0059] In other embodiments, process 200 may include a scaling operation, a resizing operation, a horizontal flipping operation, a vertical flipping operation, a cropping operation, a rotation operation, a noise filtering operation, any combination thereof, or the like.
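
As a non-limiting illustration, steps 210 through 225 of process 200 may be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the function name `preprocess_bscan`, its parameters, and the representation of a B-scan as a list of pixel rows are hypothetical, and the per-column ILM row indices are assumed to have been detected beforehand by a separate method.

```python
def preprocess_bscan(bscan, ilm_rows, target_row=2, crop_width=None):
    """Standardize one B-scan given as a list of rows of pixel intensities.

    ilm_rows[c] is an assumed, precomputed ILM row index for column c.
    All names and defaults here are hypothetical illustrations.
    """
    height, width = len(bscan), len(bscan[0])

    # Steps 210 and 215: pixels above the ILM (vitreous) are left at zero,
    # and each column is shifted so the ILM lands on the fixed row target_row.
    flat = [[0.0] * width for _ in range(height)]
    for c in range(width):
        shift = target_row - ilm_rows[c]
        for r in range(ilm_rows[c], height):
            new_r = r + shift
            if 0 <= new_r < height:
                flat[new_r][c] = float(bscan[r][c])

    # Step 220: keep only the central crop_width columns, if requested.
    if crop_width is not None and crop_width < width:
        start = (width - crop_width) // 2
        flat = [row[start:start + crop_width] for row in flat]

    # Step 225: min-max normalization so intensities fall between 0 and 1.
    values = [v for row in flat for v in row]
    lo, hi = min(values), max(values)
    scale = (hi - lo) or 1.0  # avoid division by zero on a constant image
    return [[(v - lo) / scale for v in row] for row in flat]
```

In this sketch masking and flattening are combined in a single pass; an implementation could equally perform them as separate operations in the order shown in Figure 2.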

[0060] Figure 3 shows example output images (e.g., predicted OCT imaging data 136) that may include, for example, predicted future images 306a and 306b in accordance with one or more embodiments of the present disclosure. More specifically, Figure 3 shows real future images 304a and 304b compared to corresponding predicted future images 306a and 306b respectively, which are generated by the diffusion model (e.g., LDM 122) and ControlNet 128 from baseline images 302a and 302b, respectively, and non-image features (e.g., characteristic data 110), as discussed above herein.

[0061] Figure 4 shows a block diagram of an example LDM 122 in accordance with one or more embodiments of the present disclosure. ControlNet 128 may include a trainable control branch that may guide image generation by diffusion model 122, as previously discussed herein. LDM 122 may be configured to generate predicted OCT imaging data 136 (e.g., predicted future OCT image 410). As previously discussed herein, LDM 122 may operate on a representation space and may generate a synthetic image from input conditions and noise. In various embodiments, the decoder may include a decoder 408 from a pre-trained autoencoder. In various embodiments, the autoencoder may also include an encoder 406. Inputs may, for example, include characteristic data 404 (e.g., age, sex, distance from central scan, target visit, last treatment before target visit, and so on) and / or OCT B-scan 402. The output may include predicted future image 410. In the training of ControlNet 128, trained LDM 122 may be frozen and ControlNet 128 may be trained on the paired input -output B-scans.

[0062] Figure 5 shows an example pairing process for creating training data (e.g., second training dataset 134 of Figure 1) in accordance with one or more embodiments of the present disclosure. Determining image pairs (e.g., image pairing and / or paired OCT B-scans) used for training data may include forming an input-output pair for model training, such as training ControlNet 128. As shown in Figure 5, B-scans of substantially similar locations relative to a center volume 502 may be matched such that B-scans 506a-e from a prior time t₁ may be set as example inputs and B-scans 508a-e taken at a different time from t₁, such as a subsequent time t₂, may be set as example outputs. For example, input B-scan 506a at a location A from time t₁ may be paired with corresponding output B-scan 508a at location A from time t₂.
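
As a non-limiting illustration, the location-based pairing of Figure 5 may be sketched as follows. The function name `pair_bscans`, the representation of each B-scan as an (offset from center, identifier) tuple, and the `tolerance` parameter used to decide whether two locations are substantially similar are all assumptions made for this sketch; the disclosure does not specify a matching rule.

```python
def pair_bscans(visit_t1, visit_t2, tolerance=0.0):
    """Pair B-scans from two visits of the same eye by scan location.

    Each visit is a list of (offset_from_center, bscan_id) tuples, where the
    offset is a hypothetical signed distance from the center of the volume.
    Returns a list of (input_id, output_id) training pairs.
    """
    pairs = []
    for offset1, scan1 in visit_t1:
        # Find the later-visit scan whose location is closest to this one.
        best = min(visit_t2, key=lambda item: abs(item[0] - offset1), default=None)
        # Keep the pair only if the locations are close enough to match.
        if best is not None and abs(best[0] - offset1) <= tolerance:
            pairs.append((scan1, best[1]))
    return pairs
```

Scans at time t₁ with no sufficiently close counterpart at time t₂ are simply left unpaired in this sketch.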

[0063] In several embodiments, model development may be two-staged. For instance, a first machine learning model and / or diffusion model (e.g., latent diffusion model 122) may be first trained to generate synthetic OCT B-scan images based on characteristic data. For instance, latent diffusion model 122 may be trained using first training dataset 124, as described in Figure 1. Next, a second machine learning model and / or control branch (e.g., ControlNet 128) may be trained using second training dataset 134 (e.g., the paired B-scans). To form the paired B-scans, B-scans from the same eye at different visits (e.g., at time t₁ and time t₂) may be matched using the location of the scans, as previously discussed. The second machine learning model and / or control branch may then guide the diffusion model's generation of the B-scan at a target visit (e.g., predicted OCT imaging data 136) using an input baseline B-scan and additional variables (e.g., characteristic data).

[0064] Figure 6 illustrates a diagram of an example dual-stage training process of a diffusion model and trainable control branch (e.g., ControlNet) in accordance with one or more embodiments of the present disclosure. As previously discussed herein, a diffusion model, such as LDM 122, may be trained to denoise real images with induced noise. In various embodiments, LDM 122 may generate a synthetic image from input conditions (e.g., characteristic data) and noise. As shown in block 605, autoencoder 610 of LDM 122 may be trained. Autoencoder 610 may include a pre-trained autoencoder having an encoder (e.g., encoder 602) and a decoder (e.g., decoder 604), where training data of autoencoder 610 includes example inputs of B-scans and / or additional variables (e.g., characteristic data) and example outputs may include real B-scan like images.

[0065] As shown in block 610, LDM 122 may be trained. Example inputs may include at least characteristic data and / or B-scan images and example outputs may include predicted future OCT images.

[0066] As shown in block 615, in the training of ControlNet 128 trained LDM 122 may be frozen and ControlNet 128 may be trained on the paired input-output B-scans, as discussed in Figure 5.
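
As a non-limiting illustration, the freeze-then-train pattern of blocks 610 and 615 may be sketched on a toy scalar model. This sketch is an analogy only: it is not a diffusion model or a ControlNet, and the function name `train_two_stage`, the linear model, and the learning rate are hypothetical choices made so that the example runs with no external libraries. It shows only that the first-stage parameter is fitted first and then left untouched while the second-stage parameter is fitted on paired data.

```python
def train_two_stage(stage1_data, stage2_data, lr=0.1, epochs=200):
    """Toy two-stage training with a frozen first stage.

    stage1_data: list of (x, y) pairs used to fit the base weight w.
    stage2_data: list of (x, b, y) triples used to fit the control weight c,
                 where the prediction is w * x + c * b and w stays frozen.
    """
    # Stage 1: gradient descent on mean squared error for the base weight.
    w = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in stage1_data) / len(stage1_data)
        w -= lr * grad

    # Stage 2: w is frozen (never updated below); only c receives gradients.
    c = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x + c * b - y) * b for x, b, y in stage2_data) / len(stage2_data)
        c -= lr * grad
    return w, c
```

Freezing the first-stage parameter means the second stage cannot degrade what was learned in the first stage; it can only add a correction on top of it.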

[0067] Figure 7 illustrates a block diagram of an example autoencoder 610 in accordance with one or more embodiments of the present disclosure. The latent diffusion model may use autoencoder 610 to move between the high-dimensional pixel space and a compact latent space, where the diffusion process operates (shown in Figure 8). Autoencoder 610 may provide various mappings, such as encoder 602, which compresses images into latents (e.g., transforms images into a lower-resolution latent tensor), and decoder 604, which reconstructs images from latents.

[0068] In example embodiments, an input image 702 (e.g., B-scan) may be received by encoder 602, which may perform a quantization (e.g., map the latent space representation and / or compress code) using, for example, convolutional layers.

[0069] After encoding, the diffusion model may run in the latent space. In some embodiments, forward diffusion (training) may be performed where noise is progressively added to latents to create noisy versions at different timesteps. The latent diffusion model may be trained to predict either the noise or the denoised latent based on the current noisy latent and conditioning (e.g., inputted text). In some embodiments, reverse diffusion (inference) may be performed, where, beginning with noise in latent space, the diffusion model may iteratively denoise until it produces a clean latent. In various embodiments, conditioning (e.g., conditioning 808 of Figure 8, which may include text, images, semantic maps, representations, guidance scale, or the like) may guide the denoising trajectory.
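
As a non-limiting illustration, the forward diffusion step described above may be sketched as follows. The disclosure does not specify a noise schedule; this sketch assumes the commonly used formulation in which a noisy latent at timestep t equals sqrt(ᾱ_t) times the clean latent plus sqrt(1 − ᾱ_t) times Gaussian noise, with a linear schedule of per-step noise variances. The function names and default values are hypothetical.

```python
import math
import random

def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(latent, t, alpha_bars, rng):
    """Return (noisy_latent, noise) for a flat list of latent values at step t."""
    signal = math.sqrt(alpha_bars[t])       # weight on the clean latent
    sigma = math.sqrt(1.0 - alpha_bars[t])  # weight on the added noise
    noise = [rng.gauss(0.0, 1.0) for _ in latent]
    noisy = [signal * z + sigma * e for z, e in zip(latent, noise)]
    return noisy, noise
```

During training, a model would receive the noisy latent, the timestep, and the conditioning, and would be optimized to predict the returned noise (or, alternatively, the clean latent).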

[0070] Decoder 604 may then reconstruct an image in pixel space. For example, decoder 604 may map the compact latent representation back to the image domain (e.g., invert the encoder's compression) to provide a reconstructed image and / or output image.

[0071] Figure 8 illustrates the example processing 800 by diffusion model, such as LDM 122, guided by a trainable control branch in accordance with one or more embodiments of the present disclosure. The latent diffusion model may use autoencoder 610 to move between the high-dimensional pixel space 802 and a compact latent space 804, where the diffusion process 806 operates, as previously described in Figure 7.

[0072] Figures 9A-9D show example generated images in accordance with one or more embodiments of the present disclosure. The Vabysmo Phase 2 / 3 clinical trials included, for nAMD, Avenue (faricimab 6.0 mg) N=85, Stairway (faricimab 6.0 mg) N=55, and T&L (faricimab 6.0 mg) N=620 (approximately 80% with Spectralis), and, for DME, Boulevard (faricimab 6.0 mg) N=70 and Y&R (faricimab 6.0 mg Q8W, PTI) N=1166 (approximately 80% with Spectralis).

[0073] Figure 9A shows example results from a latent diffusion model at week 0 or 4. Figure 9B shows example results from a latent diffusion model at week 0 or 4. Figure 9C shows example results from a latent diffusion model guided by a trainable control branch (e.g., ControlNet 128) at week 0, 4, or 8. Figure 9D shows example results from a latent diffusion model guided by a trainable control branch at week 0, 4, or 8.

[0074] Figure 10 is a flowchart of a process 1000 for generating synthetic predicted OCT imaging data in accordance with various embodiments. In various embodiments, process 1000 is implemented using the image generation system 101 described in Figure 1. For explanatory purposes, process 1000 is primarily described within this disclosure with reference to system 101 and its associated arrangement of components as described in Figures 1 and 3. However, process 1000 is not limited to such implementations. Any step, sub-step, sub-process, or block of process 1000 may be performed in an order or arrangement different from the embodiments illustrated in Figure 10; some may be omitted, others may be added, and some may be performed simultaneously as appropriate.

[0075] In various embodiments, process 1000 includes a dual-stage training process. In other embodiments, process 1000 includes a single stage training process (e.g., training of second model only).

[0076] Step 1002 of process 1000 includes training a first model, such as first model 120. Training of the first model 120 (e.g., diffusion model) may include first model being trained to denoise added noise from one or more real OCT images based on the patient's characteristics. Thus, the trained first model can generate a realistic synthetic OCT image from given characteristics and noise. Training of first model 120 may be consistent with the training of first model 120 previously discussed herein.

[0077] Step 1004 of process 1000 includes training second model 126. Training of the second model 126 may include taking the trained first model 120 and training an additional module (e.g., control branch or neural network), such as second model 126, to generate predicted OCT imaging data 136 (e.g., a future predicted OCT image) from baseline OCT imaging data 125 (e.g., a real baseline OCT image) and the patient's characteristics (e.g., characteristic data 110). Thus, using the trained model, a future predicted OCT image of a given baseline OCT image and patient's characteristics may be generated. Training of second model 126 may be consistent with the training of second model 126, previously discussed herein.

[0078] Step 1006 of process 1000 includes receiving input data associated with the subject, for a set of future timepoints. The input data may be, for example, input data 132 in Figure 1. For example, step 1006 may include receiving characteristic data and baseline imaging data associated with a subject at a baseline timepoint. The characteristic data and baseline imaging data may be, for example, characteristic data 110 and baseline imaging data 125 in Figure 1.

[0079] Step 1008 of process 1000 includes generating, using the trained models, predicted optical coherence tomography (OCT) imaging data associated with the retina of the subject for the set of future timepoints, based on the input data 132. In some embodiments, the trained models include a ControlNet (CN) neural network (e.g., ControlNet 128 in Figure 1) and a deep learning model, such as an LDM 130. In various embodiments, biomarkers may be extracted from predicted OCT imaging data, which may include applying separate algorithms, such as segmentation algorithms, to extract biomarker values.

[0080] The predicted OCT imaging data may be, for example, predicted OCT imaging data 136 in Figure 1. In some embodiments, predicted OCT imaging data may be used to predict values for a set of biomarkers for the future set of timepoints. In various embodiments, the set of biomarkers may include a retinal layer element, a retinal pathological element, or a combination thereof. The values for the set of biomarkers may include, for example without limitation, values for layer thickness of a retinal layer element, or values for fluid volume of a retinal pathological element.
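
As a non-limiting illustration, biomarker values such as layer thickness and fluid volume may be computed from a segmentation of the predicted OCT imaging data as sketched below. The segmentation itself is assumed to come from a separate algorithm; the label-map representation, the function names, and the pixel and voxel size parameters are hypothetical.

```python
def layer_thickness_um(label_map, layer_label, axial_um_per_px):
    """Mean thickness of one layer across the columns (A-scans) of a B-scan.

    label_map is a list of rows of integer labels from an assumed
    segmentation; axial_um_per_px is an assumed axial pixel size.
    """
    width = len(label_map[0])
    # Count the pixels carrying the layer label in each column.
    counts = [sum(1 for row in label_map if row[c] == layer_label) for c in range(width)]
    return axial_um_per_px * sum(counts) / width

def fluid_volume_mm3(label_maps, fluid_label, voxel_mm3):
    """Total fluid volume over a stack of B-scan label maps."""
    voxels = sum(1 for lm in label_maps for row in lm for v in row if v == fluid_label)
    return voxels * voxel_mm3
```

Applying the same functions to baseline and predicted images would yield the change in each biomarker between the baseline timepoint and a future timepoint.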

[0081] In some embodiments, the example methodologies described herein provide a technical effect and / or a technical improvement to a technical field of prediction, detection, and / or treatment of retinal diseases such as for example DME or nAMD. For example, in some embodiments, the predicted OCT imaging data 136 described with respect to Figure 1 is used to identify the subject as a subject with a high-risk of progressing to a predetermined stage of DME or nAMD or to a future status over a predetermined period of time, such as for example over 1 year, over 2 years, over 3 years, etc. In some embodiments, a high risk of progressing to a predetermined stage of DME or nAMD is defined as a predicted % change in a value for a set of biomarkers exceeding a % threshold; a predicted % increase or decrease of anatomical and / or functional feature value, or any other values exceeding a parameter threshold. In other embodiments, a high-risk of progressing to a predetermined stage of DME or nAMD is defined in other ways. In some embodiments, the image analysis system 101 identifies the subject as high-risk of progressing to a predetermined stage of DME or nAMD. In some embodiments and in response to the identification of the subject as a high-risk subject, the process 1000 also includes any one or more of the following steps: administering a therapy to the subject; including or excluding the subject from a clinical trial; including or excluding the subject from a stage of a clinical trial; customizing a dosing schedule of a therapy for the subject; and / or creating a monitoring schedule for the subject.
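
As a non-limiting illustration, the percent-change criterion described above may be sketched as follows. The function names are hypothetical and the threshold is left as a caller-supplied parameter, since the disclosure does not fix a particular threshold value.

```python
def percent_change(baseline, predicted):
    """Signed percent change of a biomarker value relative to baseline."""
    if baseline == 0:
        raise ValueError("percent change is undefined for a zero baseline")
    return 100.0 * (predicted - baseline) / abs(baseline)

def is_high_risk(baseline, predicted, threshold_pct):
    """True when the predicted increase or decrease exceeds the threshold."""
    return abs(percent_change(baseline, predicted)) > threshold_pct
```

For example, a biomarker predicted to move from 300.0 to 360.0 corresponds to a 20.0% change, which exceeds an assumed threshold of 15%.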

[0082] In some embodiments and when the input data 132 includes a treatment schedule associated with a first therapy, and when the predicted imaging data 136 described with respect to Figure 1 is used to identify the subject as a subject with a high-risk of progressing to a predetermined stage of DME or nAMD over a predetermined period of time, then the process may also include any one or more of the following steps: administering a second therapy that is different from the first therapy to the subject; including or excluding the subject from a clinical trial for the second therapy; including or excluding the subject from a stage of a clinical trial associated with the second therapy; customizing a dosing schedule of a second therapy for the subject; customizing a dosing schedule of the first therapy for the subject; and / or creating a monitoring schedule for the subject.

[0083] As described, the embodiments described herein may facilitate the creation of personalized treatment regimens for individual subjects to ensure the proper dosage and / or intervals between treatments. In particular, the embodiments described herein may help generate accurate, efficient, and expedient personalized treatment or dosing schedules and enhance clinical cohort selection or clinical trial design.

III. Example Results in Clinical Trials

[0084] The example system uses an artificial intelligence (AI) model that generates an optical coherence tomography (OCT) image for anatomical treatment response prediction after intravitreal injection treatment, and may be used to improve clinical trial design and improve assessment and prediction of trial data. For instance, the system may generate one or more OCT images for anatomical treatment response prediction after intravitreal injection treatment for neovascular age-related macular degeneration (nAMD).

[0085] In clinical trials, OCT volumes and clinical data from the treatment arms with faricimab 6.0 mg in AVENUE (NCT02699450) and STAIRWAY (NCT03038880) were used to train and validate an example AI system and / or model. For each study eye, B-scans were extracted from the volumetric scans and standardization was applied to remove the exogenous factors between images taken at different time points. The standardization consisted of masking the vitreous, flattening the inner limiting membrane (ILM), centering the height of the ILM, cropping to the central 3.75 mm, resizing, and normalization of the pixel intensity. The resulting images have pixel intensity between 0 and 1. The example model development may be two-staged. First, an example latent diffusion model (LDM) was trained to generate synthetic OCT B-scan images given age, sex, visit (duration from day 1), and the last treatment visit before the visit. Second, an example ControlNet (CN) was trained on the paired B-scans. To form the pairs, B-scans from the same eye at different visits were matched using the location of the scans. The example CN guides the example LDM's generation of the B-scan at the target visit using the input B-scan and the additional variables. The inception score (IS) was used to quantify the quality of generated images of the example LDM, and mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) were reported for evaluating the performance of the example CN generating the future image against the real future image. The performance of the example CN was evaluated in predicting the B-scan in the central 3 mm at week 16 (after monthly loading doses) from day 1. The metrics for the example CN were calculated excluding the masked vitreous for a fair evaluation.
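
As a non-limiting illustration, MAE and PSNR computed only over unmasked (non-vitreous) pixels may be sketched as follows. The function name `masked_mae_psnr`, the list-of-rows image representation, and the mask convention (truthy values mark pixels to include) are assumptions for this sketch; SSIM is omitted for brevity, and the code is not the evaluation code used to produce the reported results.

```python
import math

def masked_mae_psnr(real, predicted, mask, max_value=1.0):
    """MAE and PSNR between two images, restricted to pixels where mask is truthy.

    Images are lists of rows with intensities in [0, max_value]; the mask is
    assumed to be falsy over the masked vitreous.
    """
    errors = [
        abs(r - p)
        for real_row, pred_row, mask_row in zip(real, predicted, mask)
        for r, p, keep in zip(real_row, pred_row, mask_row)
        if keep
    ]
    if not errors:
        raise ValueError("mask excludes every pixel")
    mae = sum(errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    # PSNR is infinite when the two images agree exactly on the kept pixels.
    psnr = math.inf if mse == 0 else 10.0 * math.log10(max_value ** 2 / mse)
    return mae, psnr
```

Excluding the masked region matters because the vitreous is set to a constant value in both images; including it would lower the error and raise the PSNR without reflecting prediction quality in the retina.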

[0086] The IS was calculated using 5000 randomly sampled real images from STAIRWAY and 5000 synthetic images from the example LDM, with values of 1.50 and 2.25, respectively. The MAE, PSNR, and SSIM were calculated on 1284 pairs of B-scans, with values of 0.10, 17.77, and 0.30, respectively. Such results showed that an LDM and an example CN may be used for generating future OCT images to predict the anatomical treatment response.

IV. Example Implementation of Computer System

[0087] Figure 11 is a block diagram of a computer system in accordance with various embodiments. Computer system 1100 may be an example of one implementation for computing platform 102 described in Figure 1. In one or more examples, computer system 1100 can include a bus 1102 or other communication mechanism for communicating information, and a processor 1104 coupled with bus 1102 for processing information. In various embodiments, computer system 1100 can also include a memory, which can be a random-access memory (RAM) 1106 or other dynamic storage device, coupled to bus 1102 for determining instructions to be executed by processor 1104. Memory also can be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 1104. In various embodiments, computer system 1100 can further include a read only memory (ROM) 1108 or other static storage device coupled to bus 1102 for storing static information and instructions for processor 1104. A storage device 1110, such as a magnetic disk or optical disk, can be provided and coupled to bus 1102 for storing information and instructions.

[0088] In various embodiments, computer system 1100 can be coupled via bus 1102 to a display 1112, such as a cathode ray tube (CRT) or liquid crystal display (LCD), for displaying information to a computer user. An input device 1114, including alphanumeric and other keys, can be coupled to bus 1102 for communicating information and command selections to processor 1104. Another type of user input device is a cursor control 1116, such as a mouse, a joystick, a trackball, a gesture input device, a gaze-based input device, or cursor direction keys for communicating direction information and command selections to processor 1104 and for controlling cursor movement on display 1112. This input device 1114 typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane. However, it should be understood that input devices 1114 allowing for three-dimensional (e.g., x, y, and z) cursor movement are also contemplated herein.

[0089] Consistent with certain implementations of the present teachings, results can be provided by computer system 1100 in response to processor 1104 executing one or more sequences of one or more instructions contained in RAM 1106. Such instructions can be read into RAM 1106 from another computer-readable medium or computer-readable storage medium, such as storage device 1110. Execution of the sequences of instructions contained in RAM 1106 can cause processor 1104 to perform the processes described herein. Alternatively, hard-wired circuitry can be used in place of or in combination with software instructions to implement the present teachings. Thus, implementations of the present teachings are not limited to any specific combination of hardware circuitry and software.

[0090] The term "computer-readable medium" (e.g., data store, data storage, storage device, data storage device, etc.) or "computer-readable storage medium" as used herein refers to any media that participates in providing instructions to processor 1104 for execution. Such a medium can take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Examples of non-volatile media can include, but are not limited to, optical, solid state, magnetic disks, such as storage device 1110. Examples of volatile media can include, but are not limited to, dynamic memory, such as RAM 1106. Examples of transmission media can include, but are not limited to, coaxial cables, copper wire, and fiber optics, including the wires that comprise bus 1102.

[0091] Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, PROM, and EPROM, a FLASH-EPROM, any other memory chip or cartridge, or any other tangible medium from which a computer can read.

[0092] In addition to computer readable medium, instructions or data can be provided as signals on transmission media included in a communications apparatus or system to provide sequences of one or more instructions to processor 1104 of computer system 1100 for execution. For example, a communication apparatus may include a transceiver having signals indicative of instructions and data. The instructions and data are configured to cause one or more processors to implement the functions outlined in the disclosure herein. Representative examples of data communications transmission connections can include, but are not limited to, telephone modem connections, wide area networks (WAN), local area networks (LAN), infrared data connections, NFC connections, optical communications connections, etc.

[0093] It should be appreciated that the methodologies described herein, flow charts, diagrams, and accompanying disclosure can be implemented using computer system 1100 as a standalone device or on a distributed network of shared computer processing resources such as a cloud computing network.

[0094] The methodologies described herein may be implemented by various means depending upon the application. For example, these methodologies may be implemented in hardware, firmware, software, or any combination thereof. For a hardware implementation, the processing unit may be implemented within one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, electronic devices, other electronic units designed to perform the functions described herein, or a combination thereof.

[0095] In various embodiments, the methods of the present teachings may be implemented as firmware and / or a software program and applications written in conventional programming languages such as C, C++, Python, etc. If implemented as firmware and / or software, the embodiments described herein can be implemented on a non-transitory computer-readable medium in which a program is stored for causing a computer to perform the methods described above. It should be understood that the various engines described herein can be provided on a computer system, such as computer system 1100, whereby processor 1104 would execute the analyses and determinations provided by these engines, subject to instructions provided by any one of, or a combination of, the memory components RAM 1106, ROM 1108, or storage device 1110 and user input provided via input device 1114.

V. Example Definitions and Content

[0096] The disclosure is not limited to these exemplary embodiments and applications or to the manner in which the exemplary embodiments and applications operate or are described herein. Moreover, the figures may show simplified or partial views, and the dimensions of elements in the figures may be exaggerated or otherwise not in proportion.

[0097] Unless otherwise defined, scientific and technical terms used in connection with the present teachings described herein shall have the meanings that are commonly understood by those of ordinary skill in the art. Further, unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular. Generally, the nomenclatures utilized in connection with, and the techniques of, chemistry, biochemistry, molecular biology, pharmacology, and toxicology described herein are those well known and commonly used in the art.

[0098] As the terms "on," "attached to," "connected to," "coupled to," or similar words are used herein, one element (e.g., a component, a material, a layer, a substrate, etc.) can be "on," "attached to," "connected to," or "coupled to" another element regardless of whether the one element is directly on, attached to, connected to, or coupled to the other element or there are one or more intervening elements between the one element and the other element. In addition, where reference is made to a list of elements (e.g., elements a, b, c), such reference is intended to include any one of the listed elements by itself, any combination of less than all of the listed elements, and / or a combination of all of the listed elements. Section divisions in the specification are for ease of review only and do not limit any combination of elements discussed.

[0099] The term "subject" may refer to a subject of a clinical trial, a person undergoing treatment, a person undergoing anti-cancer therapies, a person being monitored for remission or recovery, a person undergoing a preventative health analysis (e.g., due to their medical history), or any other person or patient of interest. In various cases, "subject" and "patient" may be used interchangeably herein.

[0100] As used herein, "substantially" means sufficient to work for the intended purpose. The term "substantially" thus allows for minor, insignificant variations from an absolute or perfect state, dimension, measurement, result, or the like such as would be expected by a person of ordinary skill in the field but that do not appreciably affect overall performance. When used with respect to numerical values or parameters or characteristics that can be expressed as numerical values, "substantially" means within ten percent.

[0101] As used herein, the term "about" used with respect to numerical values or parameters or characteristics that can be expressed as numerical values means within ten percent of the numerical values. For example, "about 50" means a value in the range from 45 to 55, inclusive.
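By way of non-limiting illustration, the ten-percent convention described above can be expressed as a short check. The following is a minimal Python sketch; the function name `is_about` and the `tolerance` parameter are hypothetical names introduced here for illustration only.

```python
def is_about(value: float, nominal: float, tolerance: float = 0.10) -> bool:
    """Return True if value lies within +/- tolerance of nominal, inclusive."""
    return abs(value - nominal) <= tolerance * abs(nominal)


# "about 50" covers the range from 45 to 55, inclusive.
print(is_about(45, 50))
print(is_about(55, 50))
print(is_about(56, 50))
```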

[0102] The term "ones" means more than one.

[0103] As used herein, the term "plurality" may be 2, 3, 4, 5, 6, 7, 8, 9, 10, or more.

[0104] As used herein, the term "set of" means one or more. For example, a set of items includes one or more items.

[0105] As used herein, the phrase "at least one of," when used with a list of items, means different combinations of one or more of the listed items may be used and only one of the items in the list may be needed. The item may be a particular object, thing, step, operation, process, or category. In other words, "at least one of" means any combination of items or number of items may be used from the list, but not all of the items in the list may be required. For example, without limitation, "at least one of item A, item B, or item C" means item A; item A and item B; item B; item A, item B, and item C; item B and item C; or item A and item C. In some cases, "at least one of item A, item B, or item C" means, but is not limited to, two of item A, one of item B, and ten of item C; four of item B and seven of item C; or some other suitable combination.
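By way of non-limiting illustration, the combinations of distinct items covered by "at least one of" can be enumerated programmatically. The following is a minimal Python sketch using the standard library; the function name `at_least_one_of` is hypothetical. The sketch lists only combinations of distinct items and does not enumerate the cases involving multiples of an item (e.g., two of item A and one of item B).

```python
from itertools import combinations


def at_least_one_of(items):
    """List every non-empty combination of distinct items."""
    result = []
    for size in range(1, len(items) + 1):
        result.extend(combinations(items, size))
    return result


for combo in at_least_one_of(["item A", "item B", "item C"]):
    print(combo)
```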

[0106] As used herein, a "model" can refer to a system, process, relationship, or set of rules, which is instantiated, stored, and / or executed within a computing environment. The model may include, without limitation, algorithms, parameters, data structures, or executable instructions configured to perform one or more functions when processed by one or more computing devices.

[0107] As used herein, "machine learning" may be the practice of using algorithms to parse data, learn from it, and then make a determination or prediction about something in the world. Machine learning uses algorithms that can learn from data without relying on rules-based programming.

[0108] As used herein, an "artificial neural network" or "neural network" (NN) may refer to computational models that mimic an interconnected group of artificial nodes or neurons that processes information based on a connectionist approach to computation. Neural networks, which may also be referred to as neural nets, can employ one or more layers of nonlinear units to predict an output for a received input. Some neural networks include one or more hidden layers in addition to an output layer. The output of each hidden layer is used as input to the next layer in the network, e.g., the next hidden layer or the output layer. Each layer of the network generates an output from a received input in accordance with current values of a respective set of parameters. In the various embodiments, a reference to a "neural network" may be a reference to one or more neural networks.
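By way of non-limiting illustration, the layer-by-layer computation described above, in which the output of each layer is used as input to the next layer, can be sketched as follows. This is a minimal Python sketch; the function names, the layer sizes, and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def dense(inputs, weights, biases, activation):
    """One layer: a weighted sum per unit followed by an activation function."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def forward(x, layers):
    """Feed the output of each layer into the next layer."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x


# Hypothetical parameters: 2 inputs -> 2 hidden units (tanh) -> 1 output unit.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0], math.tanh),  # hidden layer
    ([[1.0, -1.0]], [0.0], lambda z: z),                  # output layer
]
print(forward([1.0, 1.0], layers))
```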

[0109] A neural network may operate in two modes: when it is being trained it is in training mode, and when it puts what it has learned into practice it is in inference (or prediction) mode. Neural networks learn through a feedback process (e.g., backpropagation) which allows the network to adjust the weight factors (modifying its behavior) of the individual nodes in the intermediate hidden layers so that the output matches the outputs of the training data. In other words, a neural network learns by being fed training data (learning examples) and eventually learns how to reach the correct output, even when it is presented with a new range or set of inputs. A neural network may include, for example, without limitation, at least one of a Feedforward Neural Network (FNN), a Recurrent Neural Network (RNN), a Modular Neural Network (MNN), a Convolutional Neural Network (CNN), a Residual Neural Network (ResNet), a Neural Ordinary Differential Equation network (neural-ODE), or another type of neural network.
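By way of non-limiting illustration, the training mode and the inference mode described above can be sketched for a very small network. The following minimal Python sketch trains a network with one hidden layer by backpropagation and gradient descent on a mean squared error loss. The toy data, the initial weights, the learning rate, and all function names are hypothetical and chosen only for illustration.

```python
import math


def predict(params, x):
    """Forward pass: 1 input -> 2 tanh hidden units -> 1 linear output."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(w1[j] * x + b1[j]) for j in range(2)]
    return sum(w2[j] * hidden[j] for j in range(2)) + b2, hidden


def mean_squared_error(params, data):
    return sum((predict(params, x)[0] - y) ** 2 for x, y in data) / len(data)


def train_step(params, data, lr):
    """One full-batch gradient descent step using backpropagation."""
    w1, b1, w2, b2 = params
    g_w1, g_b1, g_w2, g_b2 = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], 0.0
    for x, y in data:
        out, hidden = predict(params, x)
        d_out = 2.0 * (out - y) / len(data)  # gradient of the loss w.r.t. the output
        g_b2 += d_out
        for j in range(2):
            g_w2[j] += d_out * hidden[j]
            d_pre = d_out * w2[j] * (1.0 - hidden[j] ** 2)  # back through tanh
            g_w1[j] += d_pre * x
            g_b1[j] += d_pre
    return (
        [w1[j] - lr * g_w1[j] for j in range(2)],
        [b1[j] - lr * g_b1[j] for j in range(2)],
        [w2[j] - lr * g_w2[j] for j in range(2)],
        b2 - lr * g_b2,
    )


# Hypothetical toy data and initial weights.
data = [(-1.0, -0.8), (-0.5, -0.5), (0.0, 0.0), (0.5, 0.5), (1.0, 0.8)]
params = ([0.3, -0.2], [0.0, 0.1], [0.4, 0.1], 0.0)

loss_before = mean_squared_error(params, data)
for _ in range(500):
    params = train_step(params, data, lr=0.1)  # training mode
loss_after = mean_squared_error(params, data)

prediction, _ = predict(params, 0.25)  # inference mode on an input not in the data
print(loss_before, loss_after, prediction)
```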

[0110] As used herein, "deep learning" may refer to the use of multi-layered artificial neural networks to automatically learn representations from input data such as images, video, text, etc., without human provided knowledge, to deliver highly accurate predictions in tasks such as object detection / identification, speech recognition, language translation, etc.

VI. Recitation of Embodiments

[0111] Embodiment 1: A computer-based method comprising: receiving input data associated with a subject at a baseline timepoint, wherein the input data comprises at least characteristic data and baseline optical coherence tomography (OCT) imaging data; generating, using first and second machine learning models (MLs) using one or more processors, predicted optical coherence tomography (OCT) imaging data associated with the retina of the subject for a set of future timepoints, based on the input data; and predicting, using the predicted OCT imaging data, values for a set of biomarkers for the set of future timepoints.

[0112] Embodiment 2: The method of claim 1, wherein: the characteristic data comprises subject age, sex, visit history, diagnosis of an ophthalmological condition, a treatment history, or a combination thereof; and the predicted OCT imaging data comprises one or more predicted features associated with predicted responses of the retina based on, at least in part, the selected future timepoint, the treatment schedule, and / or the baseline OCT imaging data.

[0113] Embodiment 3: The method of claim 1, wherein the input data further comprises a selected future timepoint of the set of future timepoints and / or a treatment schedule associated with the retina of the subject, taking place over the set of future timepoints.

[0114] Embodiment 4: The method of claim 1, wherein: the set of biomarkers comprises a retinal layer element, a retinal pathological element, or a combination thereof; and the values for the set of biomarkers comprise layer thickness of a retinal layer element or fluid volume of a retinal pathological element.
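By way of non-limiting illustration, biomarker values such as a layer thickness and a fluid volume can be computed from a segmented OCT image by counting labeled pixels. The following is a minimal Python sketch that assumes a segmentation label map of the predicted OCT image is already available. The label values, the pixel and voxel sizes, the units, and the function names are hypothetical and are not taken from the present disclosure.

```python
def layer_thickness_um(label_map, layer_label, axial_um_per_pixel):
    """Mean layer thickness: per-column pixel count times the axial pixel size."""
    n_cols = len(label_map[0])
    counts = [
        sum(1 for row in label_map if row[c] == layer_label) for c in range(n_cols)
    ]
    return axial_um_per_pixel * sum(counts) / n_cols


def fluid_volume_nl(label_maps, fluid_label, voxel_volume_nl):
    """Fluid volume: number of fluid voxels over all B-scans times the voxel volume."""
    n_voxels = sum(row.count(fluid_label) for b_scan in label_maps for row in b_scan)
    return n_voxels * voxel_volume_nl


# Hypothetical 4x3 label map of one B-scan: 0 = background, 1 = layer, 2 = fluid.
b_scan = [
    [0, 0, 0],
    [1, 1, 1],
    [1, 2, 1],
    [0, 0, 0],
]
print(layer_thickness_um(b_scan, 1, axial_um_per_pixel=4.0))
print(fluid_volume_nl([b_scan, b_scan], 2, voxel_volume_nl=0.5))
```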

[0115] Embodiment 5: The method of claim 1, further comprising: training the first ML model using a first training dataset that comprises example characteristic data inputs and correlated example OCT imaging data outputs, wherein the correlated example OCT imaging data outputs are preprocessed prior to use in the first training dataset.

[0116] Embodiment 6: The method of claim 5, wherein preprocessing the example OCT imaging data outputs comprises: performing a set of preprocessing operations on the example OCT imaging outputs, comprising at least one of masking a vitreous humor of the eye based on inner limiting membrane (ILM) detection, flattening a retinal layer, moving a height of the flattened layer to a fixed height, cropping a width and / or height of the B-scan, normalizing the B-scan, executing a pixel intensity normalization operation, a scaling operation, a resizing operation, a horizontal flipping operation, a vertical flipping operation, a cropping operation, a rotation operation, a noise filtering operation, or some other type of preprocessing operation.
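By way of non-limiting illustration, three of the preprocessing operations listed above (masking the vitreous humor above the ILM, flattening a layer to a fixed height, and pixel intensity normalization) can be sketched as follows. This minimal Python sketch assumes that the ILM row in each column (A-scan) has already been detected by a separate step and, for simplicity, uses the ILM itself as the layer that is flattened. The toy image, the min-max normalization, and the function names are hypothetical choices made only for illustration.

```python
def mask_vitreous(b_scan, ilm_rows):
    """Zero every pixel above the ILM row detected in each column."""
    return [
        [0.0 if r < ilm_rows[c] else v for c, v in enumerate(row)]
        for r, row in enumerate(b_scan)
    ]


def flatten_to_height(b_scan, layer_rows, target_row):
    """Shift each column vertically so that layer_rows[c] lands on target_row."""
    n_rows, n_cols = len(b_scan), len(b_scan[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for c in range(n_cols):
        shift = target_row - layer_rows[c]
        for r in range(n_rows):
            if 0 <= r + shift < n_rows:
                out[r + shift][c] = b_scan[r][c]
    return out


def normalize_intensity(b_scan):
    """Min-max scale pixel intensities to the range [0, 1]."""
    values = [v for row in b_scan for v in row]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant image
    return [[(v - lo) / span for v in row] for row in b_scan]


# Hypothetical 4x3 B-scan; rows run from the top (vitreous side) downward.
scan = [
    [9.0, 9.0, 9.0],
    [5.0, 9.0, 5.0],
    [2.0, 5.0, 2.0],
    [0.0, 2.0, 0.0],
]
ilm = [1, 2, 1]  # assumed ILM row index per column
masked = mask_vitreous(scan, ilm)
flat = flatten_to_height(masked, layer_rows=ilm, target_row=1)
print(normalize_intensity(flat))
```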

[0117] Embodiment 7: The method of claim 1, wherein the second machine learning model comprises a trainable control branch.

[0118] Embodiment 8: The method of claim 7, further comprising: training the trainable control branch model using a second training dataset that comprises example input data inputs, corresponding example baseline OCT imaging data inputs, and correlated example future OCT imaging data outputs, wherein the corresponding example baseline OCT imaging data inputs, the correlated example OCT imaging data outputs, or a combination thereof are preprocessed prior to use in the second training dataset.

[0119] Embodiment 9: The method of claim 7, wherein preprocessing the corresponding example baseline OCT imaging data inputs, the correlated example OCT imaging data outputs, or a combination thereof comprises: performing a set of preprocessing operations on the OCT imaging data, comprising at least one of masking a vitreous humor of the eye based on inner limiting membrane (ILM) detection, flattening a retinal layer, moving a height of the flattened layer to a fixed height, cropping a width and / or height of the B-scan, normalizing the B-scan, executing a pixel intensity normalization operation, a scaling operation, a resizing operation, a horizontal flipping operation, a vertical flipping operation, a cropping operation, a rotation operation, a noise filtering operation, or some other type of preprocessing operation.

[0120] Embodiment 10: The method of claim 7, wherein the example future OCT imaging data outputs correspond to the example baseline OCT imaging data inputs.

[0121] Embodiment 11: The method of claim 1, wherein the ML comprises a diffusion model (DM).

[0122] Embodiment 12: The method of claim 11, wherein the trainable control branch comprises a ControlNet (CN) neural network configured to control the diffusion model (DM).
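By way of non-limiting illustration, the general idea of a trainable control branch attached to a frozen model can be sketched in a highly simplified form. In the publicly described ControlNet design, a trainable copy of a pretrained block is connected to the frozen block through zero-initialized layers, so that the branch contributes nothing before training. The following minimal Python sketch uses scalar operations as stand-ins for convolutional layers; the class name, the function names, and the numeric values are hypothetical, and the sketch is not the implementation of the present disclosure.

```python
def frozen_block(x):
    """Stand-in for a pretrained, frozen diffusion-model block (hypothetical)."""
    return [2.0 * v + 1.0 for v in x]


class ControlBranch:
    """Trainable copy of the block, joined to the frozen path by a zero-initialized gate."""

    def __init__(self):
        self.scale, self.bias = 2.0, 1.0  # copied from the frozen block, then trainable
        self.zero_gate = 0.0              # starts at zero, so the branch is initially inert

    def __call__(self, x, condition):
        mixed = [self.scale * (v + c) + self.bias for v, c in zip(x, condition)]
        return [self.zero_gate * m for m in mixed]


def controlled_block(x, condition, branch):
    """Frozen output plus the residual contribution of the control branch."""
    return [f + r for f, r in zip(frozen_block(x), branch(x, condition))]


x = [0.1, -0.3]
condition = [1.0, 0.5]  # e.g., features derived from baseline data (hypothetical)
branch = ControlBranch()
print(controlled_block(x, condition, branch))  # equals frozen_block(x) before training

branch.zero_gate = 0.5  # stands in for a value learned during training
print(controlled_block(x, condition, branch))  # now influenced by the condition
```

In this sketch only the attributes of `ControlBranch` would be updated during training, while `frozen_block` is left unchanged.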

[0123] Embodiment 13: The method of claim 9 or claim 11, wherein the diffusion model is a latent diffusion model (LDM).

[0124] Embodiment 14: The method of claim 1, wherein the predicted optical coherence tomography (OCT) imaging data comprises a predicted future OCT image.

[0125] Embodiment 15: The method of any one of claims 1-14, further comprising: identifying the subject, based on the predicted OCT imaging data, as high-risk of progressing to a predetermined stage of diabetic macular edema (DME) or neovascular age-related macular degeneration (nAMD); and administering a therapy to the subject in response to the subject being identified as high-risk of progressing to the predetermined stage of DME or nAMD.
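By way of non-limiting illustration, one possible way to identify a subject as high-risk is to compare predicted biomarker values at the future timepoints against a threshold. The threshold rule, the threshold value, the units, and the function name in the following minimal Python sketch are assumptions introduced here for illustration only; the present disclosure does not limit the identification to this rule.

```python
def first_high_risk_timepoint(predicted_by_week, threshold):
    """Return the earliest week whose predicted value meets the threshold, else None."""
    for week in sorted(predicted_by_week):
        if predicted_by_week[week] >= threshold:
            return week
    return None


# Hypothetical predicted fluid volumes keyed by future timepoint (in weeks).
predicted_fluid_nl = {4: 10.0, 8: 35.0, 12: 60.0}
week = first_high_risk_timepoint(predicted_fluid_nl, threshold=30.0)
print("high-risk" if week is not None else "not high-risk", week)
```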

[0126] Embodiment 16: The method of any one of claims 1-14, further comprising: identifying the subject, based on the predicted OCT imaging data, as high-risk of progressing to a predetermined stage of diabetic macular edema (DME) or neovascular age-related macular degeneration (nAMD); and including or excluding the subject from a stage of a clinical trial in response to the subject being identified as high-risk of progressing to the predetermined stage of DME or nAMD.

[0127] Embodiment 17: The method of any one of claims 1-14, wherein the input data comprises a selected future timepoint of the set of future timepoints and a first treatment schedule associated with the retina of the subject, taking place over the set of future timepoints; wherein the first treatment schedule is for a first therapy; wherein the predicted OCT imaging data comprises one or more predicted features associated with predicted responses of the retina based on, at least in part, the selected future timepoint, the first treatment schedule, and the baseline OCT imaging data; and wherein the method further comprises: identifying the subject, based on the predicted OCT imaging data, as high-risk of progressing to a predetermined stage of DME or nAMD; and administering a second therapy to the subject in response to the subject being identified as high-risk of progressing to the predetermined stage of DME or nAMD; wherein the second therapy is different from the first therapy.

[0128] Embodiment 18: A system comprising: one or more data processors; and a non-transitory computer readable storage medium containing instructions which, when executed on the one or more data processors, cause the one or more data processors to perform part or all of one or more methods disclosed in claims 1-17.

[0129] Embodiment 19: A computer-program product tangibly embodied in a non-transitory machine-readable storage medium, including instructions configured to cause one or more data processors to perform part or all of one or more methods disclosed in claims 1-17.

VII. Additional Considerations

[0130] Any headers and / or subheaders between sections and subsections of this document are included solely for the purpose of improving readability and do not imply that features cannot be combined across sections and subsections. Accordingly, sections and subsections do not describe separate embodiments.

[0131] While the present teachings are described in conjunction with various embodiments, it is not intended that the present teachings be limited to such embodiments. On the contrary, the present teachings encompass various alternatives, modifications, and equivalents, as will be appreciated by those of skill in the art. The present description provides preferred exemplary embodiments, and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the present description of the preferred exemplary embodiments will provide those skilled in the art with an enabling description for implementing various embodiments. It is understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope as set forth in the appended claims. Thus, such modifications and variations are considered to be within the scope set forth in the appended claims. Further, the terms and expressions which have been employed are used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed.

[0132] In describing the various embodiments, the specification may have presented a method and / or process as a particular sequence of steps. However, to the extent that the method or process does not rely on the particular order of steps set forth herein, the method or process should not be limited to the particular sequence of steps described, and one skilled in the art can readily appreciate that the sequences may be varied and still remain within the spirit and scope of the various embodiments.

[0133] Some embodiments of the present disclosure include a system including one or more data processors. In some embodiments, the system includes a non-transitory computer readable storage medium containing instructions which, when executed on the one or more data processors, cause the one or more data processors to perform part or all of one or more methods and / or part or all of one or more processes disclosed herein. Some embodiments of the present disclosure include a computer-program product tangibly embodied in a non-transitory machine-readable storage medium, including instructions configured to cause one or more data processors to perform part or all of one or more methods and / or part or all of one or more processes disclosed herein.

[0134] Specific details are given in the present description to provide an understanding of the embodiments. However, it is understood that the embodiments may be practiced without these specific details. For example, circuits, systems, networks, processes, and other components may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments.

Claims

What we claim is:

1. A computer-based method comprising:
receiving input data associated with a subject at a baseline timepoint, wherein the input data comprises at least characteristic data and baseline optical coherence tomography (OCT) imaging data;
generating, using first and second machine learning models (MLs) using one or more processors, predicted optical coherence tomography (OCT) imaging data associated with the retina of the subject for a set of future timepoints, based on the input data; and
predicting, using the predicted OCT imaging data, values for a set of biomarkers for the set of future timepoints.

2. The method of claim 1, wherein:
the characteristic data comprises subject age, sex, visit history, diagnosis of an ophthalmological condition, a treatment history, or a combination thereof; and
the predicted OCT imaging data comprises one or more predicted features associated with predicted responses of the retina based on, at least in part, the selected future timepoint, the treatment schedule, and / or the baseline OCT imaging data.

3. The method of claim 1, wherein the input data further comprises a selected future timepoint of the set of future timepoints and / or a treatment schedule associated with the retina of the subject, taking place over the set of future timepoints.

4. The method of claim 1, wherein:
the set of biomarkers comprises a retinal layer element, a retinal pathological element, or a combination thereof; and
the values for the set of biomarkers comprise layer thickness of a retinal layer element or fluid volume of a retinal pathological element.

5. The method of claim 1, further comprising:
training the first ML model using a first training dataset that comprises example characteristic data inputs and correlated example OCT imaging data outputs, wherein the correlated example OCT imaging data outputs are preprocessed prior to use in the first training dataset.

6. The method of claim 5, wherein preprocessing the example OCT imaging data outputs comprises:
performing a set of preprocessing operations on the example OCT imaging outputs, comprising at least one of masking a vitreous humor of the eye based on inner limiting membrane (ILM) detection, flattening a retinal layer, moving a height of the flattened layer to a fixed height, cropping a width and / or height of the B-scan, normalizing the B-scan, executing a pixel intensity normalization operation, a scaling operation, a resizing operation, a horizontal flipping operation, a vertical flipping operation, a cropping operation, a rotation operation, a noise filtering operation, or some other type of preprocessing operation.

7. The method of claim 1, wherein the second machine learning model comprises a trainable control branch.

8. The method of claim 7, further comprising:
training the trainable control branch model using a second training dataset that comprises example input data inputs, corresponding example baseline OCT imaging data inputs, and correlated example future OCT imaging data outputs,
wherein the corresponding example baseline OCT imaging data inputs, the correlated example OCT imaging data outputs, or a combination thereof are preprocessed prior to use in the second training dataset.

9. The method of claim 7, wherein preprocessing the corresponding example baseline OCT imaging data inputs, the correlated example OCT imaging data outputs, or a combination thereof comprises:
performing a set of preprocessing operations on the OCT imaging data, comprising at least one of masking a vitreous humor of the eye based on inner limiting membrane (ILM) detection, flattening a retinal layer, moving a height of the flattened layer to a fixed height, cropping a width and / or height of the B-scan, normalizing the B-scan, executing a pixel intensity normalization operation, a scaling operation, a resizing operation, a horizontal flipping operation, a vertical flipping operation, a cropping operation, a rotation operation, a noise filtering operation, or some other type of preprocessing operation.

10. The method of claim 7, wherein the example future OCT imaging data outputs correspond to the example baseline OCT imaging data inputs.

11. The method of claim 1, wherein the ML comprises a diffusion model (DM).

12. The method of claim 11, wherein the trainable control branch comprises a ControlNet (CN) neural network configured to control the diffusion model (DM).

13. The method of claim 9 or claim 11, wherein the diffusion model is a latent diffusion model (LDM).

14. The method of claim 1, wherein the predicted optical coherence tomography (OCT) imaging data comprises a predicted future OCT image.

15. The method of any one of claims 1-14, further comprising:
identifying the subject, based on the predicted OCT imaging data, as high-risk of progressing to a predetermined stage of diabetic macular edema (DME) or neovascular age-related macular degeneration (nAMD); and
administering a therapy to the subject in response to the subject being identified as high-risk of progressing to the predetermined stage of DME or nAMD.

16. The method of any one of claims 1-14, further comprising:
identifying the subject, based on the predicted OCT imaging data, as high-risk of progressing to a predetermined stage of diabetic macular edema (DME) or neovascular age-related macular degeneration (nAMD); and
including or excluding the subject from a stage of a clinical trial in response to the subject being identified as high-risk of progressing to the predetermined stage of DME or nAMD.

17. The method of any one of claims 1-14,
wherein the input data comprises a selected future timepoint of the set of future timepoints and a first treatment schedule associated with the retina of the subject, taking place over the set of future timepoints;
wherein the first treatment schedule is for a first therapy;
wherein the predicted OCT imaging data comprises one or more predicted features associated with predicted responses of the retina based on, at least in part, the selected future timepoint, the first treatment schedule, and the baseline OCT imaging data; and
wherein the method further comprises:
identifying the subject, based on the predicted OCT imaging data, as high-risk of progressing to a predetermined stage of DME or nAMD; and
administering a second therapy to the subject in response to the subject being identified as high-risk of progressing to the predetermined stage of DME or nAMD;
wherein the second therapy is different from the first therapy.

18. A system comprising:
one or more data processors; and
a non-transitory computer readable storage medium containing instructions which, when executed on the one or more data processors, cause the one or more data processors to perform part or all of one or more methods disclosed in claims 1-17.

19. A computer-program product tangibly embodied in a non-transitory machine-readable storage medium, including instructions configured to cause one or more data processors to perform part or all of one or more methods disclosed in claims 1-17.