Deep neural network framework for processing OCT images to predict treatment intensity
OCT images are analyzed with deep convolutional neural networks to generate individualized treatment plans, addressing the problem of determining how frequently aVEGF drugs should be injected and improving the efficacy and safety of treatment for wet age-related macular degeneration.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- GENENTECH INC
- Filing Date
- 2020-12-04
- Publication Date
- 2026-06-26
Figure CN115039122B_ABST
Description
[0001] Cross-reference to related applications
[0002] This application claims the benefit of and priority to U.S. Provisional Application No. 62/944,815, filed December 6, 2019, and U.S. Provisional Application No. 63/017,898, filed April 30, 2020. Each of these applications is hereby incorporated herein by reference in its entirety for all purposes.
Background
[0003] Age-related macular degeneration (AMD) is a leading cause of vision loss in people over 60 years of age. For most people, AMD initially presents as dry AMD, which then progresses to wet AMD. In dry AMD, small deposits (drusen) form under the macula of the retina, eventually leading to retinal degeneration. In wet AMD, abnormal blood vessels grow towards the macula. These vessels often rupture and leak fluid, which can cause the macula to separate from its base, resulting in severe and acute vision loss.
[0004] Anti-vascular endothelial growth factor (aVEGF) agents are frequently used to treat wet AMD. Specifically, aVEGF agents can dry out the retina of a subject, allowing for better control of wet AMD and thus reducing or preventing permanent vision loss. However, aVEGF agents are administered via intravitreal injection, which is unpopular with subjects and carries potential side effects (e.g., red eye, eye pain, infection). Therefore, protocols exist that attempt to identify the minimum effective frequency for aVEGF injections. Most of these techniques amount to trial and error.
[0005] One such technique is a treat-and-extend protocol, in which the intervals between injections are gradually increased, provided that no new leakage was observed during the previous injection interval. A drawback of this method is that some subjects may experience new leakage before the injection frequency is raised back to a sufficiently high level.
[0006] It would be advantageous to identify objective, individual-specific methods for determining an aVEGF injection schedule that is sufficient to effectively keep the eye dry while avoiding over-injection.
Summary of the Invention
[0007] In some embodiments, an optical coherence tomography (OCT) image corresponding to the eye of a subject suffering from age-related macular degeneration (e.g., wet age-related macular degeneration) is accessed. A set of pixels within the OCT image corresponding to a retinal layer is identified. The OCT image is planarized (flattened) based on this set of pixels. One or more cropping procedures are performed using the planarized OCT image to produce one or more cropped images. A label corresponding to a feature of a recommended treatment plan for the subject's eye is generated using the one or more cropped images. The label is output.
[0008] The one or more cropped images may include multiple cropped images, each comprising a different block (patch) within the planarized OCT image. Generating the label may include, for each of the one or more cropped images, generating a block-specific result using a block-specific neural network. The block-specific neural network may have been trained with other images of the same size as the cropped image. Processing the one or more cropped images may further include processing the block-specific results using an ensemble model.
[0009] The label may indicate a frequency of treatment administration and/or an interval between consecutive treatment administrations, in each case one predicted to be sufficiently effective to prevent fluid leakage from blood vessels in the eye between consecutive treatment administrations. Likewise, the feature of the recommended treatment plan may include such a frequency of treatment administration and/or such an interval between consecutive treatment administrations.
[0010] The neural network may include a deep convolutional neural network with at least 5 convolutional blocks and fewer than 10,000 learnable parameters. The retinal layer (e.g., the layer whose depiction is used for planarization) may include the retinal pigment epithelium layer. The recommended treatment plan may include a recommended plan for administering anti-vascular endothelial growth factor. The label may have been generated by feeding the one or more cropped images into a neural network (e.g., including one or more convolutional neural networks and/or ensemble neural networks).
[0011] In some embodiments, a system is provided comprising: one or more data processors; and a non-transitory computer-readable storage medium containing instructions that, when executed on the one or more data processors, cause the one or more data processors to perform part or all of the methods disclosed herein.
[0012] In some embodiments, a computer program product is provided, tangibly embodied in a non-transitory machine-readable storage medium, and includes instructions configured to cause one or more data processors to perform part or all of the methods disclosed herein.
[0013] In some embodiments, a method is provided for treating the eye of a subject suffering from age-related macular degeneration (AMD). An OCT image depicting at least a portion of the eye of the subject suffering from AMD (e.g., wet AMD) is accessed. Processing of the OCT image is initiated using a machine learning model. This processing includes planarizing the OCT image and processing at least a portion of the planarized OCT image using a neural network. The result of the OCT image processing is accessed. This result indicates the characteristics of a recommended treatment plan for the subject's eye. The subject's eye is treated according to the recommended treatment plan. Treating the subject's eye according to the recommended treatment plan may include administering anti-vascular endothelial growth factor to the eye according to the recommended treatment plan.
[0014] Some embodiments of this disclosure include a system comprising one or more data processors. In some embodiments, the system includes a non-transitory computer-readable storage medium containing instructions that, when executed on the one or more data processors, cause the one or more data processors to perform part or all of one or more methods and / or part or all of one or more processes disclosed herein. Some embodiments of this disclosure include a computer program product tangibly embodied in a non-transitory machine-readable storage medium, comprising instructions configured to cause one or more data processors to perform part or all of one or more methods and / or part or all of one or more processes disclosed herein.
[0015] The terms and expressions used are descriptive rather than restrictive, and their use is not intended to exclude any equivalents of the features or portions thereof shown and described, but it should be recognized that various modifications can be made within the scope of the claimed invention. Therefore, it should be understood that while the claimed invention has been specifically disclosed by way of examples and optional features, modifications and variations of the concepts disclosed herein can be made by those skilled in the art, and such modifications and variations are considered to be within the scope of the invention as defined in the appended claims.
Brief Description of the Drawings
[0016] This disclosure is described in conjunction with the following figures:
[0017] Figure 1 shows a block diagram of a network, according to some embodiments, for collecting and analyzing optical coherence tomography (OCT) images to predict treatment administration plans that are effective for treating eye diseases.
[0018] Figure 2 shows a flowchart of an exemplary process for processing OCT images using a deep neural network to generate labels corresponding to a treatment administration plan.
[0019] Figures 3A, 3B, and 3C show multiple 2D OCT images associated with different depths of a subject's eye.
[0020] Figures 4A and 4B show unplanarized and planarized OCT images.
[0021] Figure 5 shows a deep learning workflow for training and using deep learning networks to process data.
[0022] Figures 6A and 6B show aggregated receiver operating characteristic (ROC) curves based on labels generated by a dense neural network (associated with a first labeling framework) that predicts whether a single OCT image corresponds to an instance where the relevant eye will receive high-intensity aVEGF treatment or low-intensity aVEGF treatment, respectively.
[0023] Figure 6C shows an ROC curve characterizing the accuracy of treatment intensity predictions (associated with the first labeling framework) generated by processing OCT images using a random forest model.
[0024] Figures 7A and 7B show ROC curves based on labels generated by a dense neural network (associated with a second labeling framework) that predicts whether various OCT images correspond to instances where the relevant eye will receive high-intensity or low-intensity aVEGF treatment, respectively.
[0025] Figure 7C shows an ROC curve characterizing the accuracy of treatment intensity predictions (associated with the second labeling framework) generated by processing OCT images using a random forest model.
[0026] In the accompanying drawings, similar parts and/or features may have the same reference numerals. Furthermore, various parts of the same type can be distinguished by adding a dash after the reference numeral and a second reference numeral to differentiate similar parts. If only the first reference numeral is used in the description, the description applies to any similar parts having the same first reference numeral, regardless of the second reference numeral.
Detailed Description
[0027] I. Overview
[0028] This description relates to predicting characteristics of a treatment plan for a given subject and a given eye based on the processing of optical coherence tomography (OCT) images. The treatment plan may indicate when each of multiple doses of an anti-vascular endothelial growth factor (aVEGF) agent will be administered. aVEGF agents may include, for example, ranibizumab or bevacizumab. aVEGF treatment characteristics may indicate, for example, the frequency of treatment administration, one or more time periods between consecutive treatment administrations, or the number of treatments to be administered within a given time period. In some instances, aVEGF treatment characteristics may alternatively or additionally identify the dose of the active ingredient to be administered.
[0029] A given eye may have (e.g., may have been diagnosed with) macular degeneration, such as age-related macular degeneration and/or wet age-related macular degeneration.
[0030] In one embodiment, the OCT image is preprocessed to flatten the image based on a depiction of a specific biological structure, such as the retinal pigment epithelium. The preprocessing further includes cropping the flattened image to exclude portions of the image that are relatively far removed from the straightened depiction of the retinal pigment epithelium.
[0031] The preprocessed image is then fed into a trained neural network (e.g., a deep neural network and/or a convolutional neural network) that generates labels corresponding to treatment plans predicted to be effective for the eye (e.g., effective for treating age-related macular degeneration, such as wet age-related macular degeneration). More specifically, the labels may indicate characteristics of the treatment plan predicted to be effective in preventing vascular leakage between consecutive administrations of the therapeutic agent (e.g., consecutive injections of aVEGF).
[0032] The label may include one or more numbers (e.g., identifying the frequency of recommended treatment administration, the count of recommended treatment administrations within a predefined time period, or the recommended interval between consecutive treatment administrations), one or more categories (e.g., “low”, “medium”, or “high” frequency identifiers), and/or one or more binary indicators (e.g., “low” or “non-low” frequency identifiers). For example, a “low” label may predict that a treatment administration count or frequency equal to (or possibly even lower than) a predefined threshold or value will be effective in preventing leakage between treatment administrations for a given subject’s eye. As another example, a “high” label may indicate that leakage between treatment administrations is predicted for a given subject’s eye unless high-frequency treatment is used (e.g., defined using a treatment administration count or frequency threshold or value). The threshold may be defined as an absolute count within a time period (e.g., such that a low label is assigned if 5 or fewer treatments are administered within a 20-month time period; a high label is assigned if 15 or more treatments are administered within a 20-month time period; and a medium label is assigned if between 6 and 14 treatments are administered within a 20-month time period). In some instances, a threshold for the frequency of treatment administration is used to define the label (e.g., such that if treatment is administered at a rate of less than once every three months on average, a “low” label is assigned; and if treatment is administered at a rate of at least once per month on average, a “high” label is assigned).
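The count-based thresholds in the example above can be sketched in a few lines of Python. This is only an illustration: the function name `intensity_label` and its parameter names are hypothetical, and the default cut-offs (5 and 15 treatments within the monitored period) are taken from the 20-month example in paragraph [0032] rather than from any fixed rule.

```python
def intensity_label(n_treatments: int, low_max: int = 5, high_min: int = 15) -> str:
    """Map a treatment count within the monitored period to a categorical label.

    Defaults mirror the illustrative 20-month example: 5 or fewer -> "low",
    15 or more -> "high", anything in between -> "medium".
    """
    if n_treatments < 0:
        raise ValueError("treatment count cannot be negative")
    if n_treatments <= low_max:
        return "low"
    if n_treatments >= high_min:
        return "high"
    return "medium"


def is_low(n_treatments: int, low_max: int = 5) -> bool:
    """Binary variant: "low" versus "non-low"."""
    return n_treatments <= low_max
```

A binary "low"/"non-low" indicator and a three-way category can thus be derived from the same count, which is one way the different label types listed above could coexist.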
[0033] Each label can be generated via an activation layer in a neural network.
[0034] In some instances, a set of treatment administration schedules is defined. For example, a first schedule might indicate that doses will be delivered at 1 month, 2 months, 4 months, 6 months, 9 months, and 12 months from baseline; and a second schedule might indicate that doses will be delivered at 1 month, 2 months, 3 months, 4.5 months, 6 months, 8 months, 10 months, and 12 months. A label can then identify a specific one of this set of treatment administration schedules.
[0035] The label can indicate the characteristics of a maintenance treatment plan following the administration of initial (or pilot) treatment. For example, initial treatment can be defined as consisting of a specific number of treatment administrations administered according to a particular schedule (e.g., monthly for three months). Initial treatment may, but does not have to, use the same type of therapeutic agent and the same specific schedule across subjects. Maintenance treatment may use the same therapeutic agent as used in initial treatment or a different one.
[0036] Neural networks can include deep networks that may contain multiple convolutional blocks (e.g., 10 convolutional blocks), which can exponentially increase the network's expressiveness. Neural networks can further or alternatively be very thin (e.g., having fewer than 6,000 learnable parameters), which allows them to be trained with relatively few computational resources and quickly (e.g., less than one minute per epoch). Neural networks can include fully convolutional neural networks that are invariant to the size of the input space.
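To make the "deep but thin" trade-off concrete, the following arithmetic sketch counts learnable parameters for a hypothetical layout: 10 convolutional blocks, each assumed to be a single 3x3 convolution with 8 channels and a bias term, followed by a 1x1 convolution producing one output. The layout, channel width, and function names are assumptions chosen for illustration; the source does not specify the architecture.

```python
def conv2d_params(c_in: int, c_out: int, kernel: int = 3, bias: bool = True) -> int:
    """Learnable parameters of one 2D convolution: weights plus optional biases."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def thin_cnn_params(n_blocks: int = 10, width: int = 8,
                    in_channels: int = 1, n_outputs: int = 1) -> int:
    """Total parameters for a hypothetical stack of single-convolution blocks."""
    total = conv2d_params(in_channels, width)               # first block: 1 -> width
    total += (n_blocks - 1) * conv2d_params(width, width)   # remaining blocks
    total += conv2d_params(width, n_outputs, kernel=1)      # 1x1 output convolution
    return total


total = thin_cnn_params()
print(total)  # 5345
```

Under these assumptions the network stays below the 6,000-parameter figure mentioned above, and because every layer is convolutional the parameter count does not depend on the input image size.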
[0037] A neural network can be trained using a set of preprocessed training OCT images in blocks. Data augmentation can be performed by extracting multiple blocks (e.g., of different sizes and / or different relative positions) from a single OCT image. In some instances, a single neural network (e.g., a flexible CNN) is trained using blocks of different sizes. In other instances, each of multiple neural networks is associated with a given block size and / or a given relative block position and is trained using images of the given block size and / or given relative block position. Neural networks can include ensemble models that are built and trained to aggregate and process the results from multiple block-size-specific neural networks. Cross-validation techniques can be used concurrently with training the neural network. For example, cross-validation techniques can include 5-fold cross-validation or Monte Carlo cross-validation, which may have advantages in terms of scalability, fault tolerance, pooling, and support for distributed model training, selection, and / or evaluation.
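The block-based data augmentation described in this paragraph might look like the following NumPy sketch, in which several square blocks of different sizes are cut from random positions of one preprocessed image. The function name `random_blocks`, the default sizes, and the number of blocks per size are hypothetical choices, not values from the source.

```python
import numpy as np


def random_blocks(image: np.ndarray, sizes=(128, 256), per_size: int = 4, seed: int = 0):
    """Extract `per_size` randomly positioned square blocks for each block size."""
    rng = np.random.default_rng(seed)
    h, w = image.shape
    blocks = []
    for size in sizes:
        if size > h or size > w:
            continue  # skip block sizes that do not fit inside the image
        for _ in range(per_size):
            top = int(rng.integers(0, h - size + 1))   # upper bound is exclusive
            left = int(rng.integers(0, w - size + 1))
            blocks.append(image[top:top + size, left:left + size])
    return blocks
```

When blocks from one image are used for training, it is usually sensible to keep all blocks from the same eye in the same cross-validation fold so that a fold's validation data is not a near-copy of its training data.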
[0038] The system can output labels that identify characteristics of a treatment plan for use by care providers. In some instances, the output may indicate (or otherwise represent) that the identified treatment plan characteristics are those likely to be associated with a maintenance therapy period. Care providers can use the labels to inform their choice of treatment. For example, a care provider may recommend and/or prescribe a treatment plan (e.g., for aVEGF treatment) having the treatment plan characteristics indicated by the labels.
[0039] II. Definitions
[0040] As used herein, “effective” treatment for AMD can include a situation where no new vascular leakage or visual acuity deterioration is observed between administered doses (e.g., of a treatment administered according to a specific treatment plan). In some instances, effective treatment includes minimum effective treatment. For example, it should be understood that multiple treatment plans may be effective for treating AMD, in which case a minimum effective treatment may be used, which may be characterized by, for example, the minimum amount of treatment administered over a period of time, the minimum frequency of treatment administration over a period of time, the longest average duration between consecutive treatment administrations, and/or the minimum total dose of treatment administered over a period of time.
[0041] As used herein, a treatment “plan” indicates when each of a plurality of doses of a particular treatment will be administered. A treatment plan may identify, for example, a set of dates, one or more time intervals between doses, or a frequency of administration. For example, a treatment plan may indicate that a particular treatment will be administered once a month. In some instances, a treatment plan includes one or more ranges. For example, a treatment plan may identify an administration interval that indicates that each of one or more treatment doses will be administered at some time between 7 and 9 weeks after a previous treatment administration.
[0042] III. Network for generating treatment plan labels using OCT images
[0043] Figure 1 shows a block diagram of a network 100, according to some embodiments of the present invention, for collecting and analyzing optical coherence tomography (OCT) images to predict treatment administration plans effective for treating eye diseases. Network 100 includes one or more imaging systems 105 configured to collect one or more images, each image depicting at least a portion of a subject's eye.
[0044] Imaging system 105 can be configured to acquire optical coherence tomography (OCT) images using OCT imaging technology. OCT is a non-invasive technique that uses light waves to construct cross-sectional images of the eye. Imaging system 105 may include an interferometer (e.g., including a light source, a beam splitter, and a reference mirror). For example, the light source can generate a light beam (e.g., a low-coherence near-infrared beam), which can be split by the beam splitter. A first portion of the split beam can be directed to the subject's eye, and a second portion of the split beam can be directed to the reference mirror. Backscattered light from the eye and from the reference mirror can be combined, and the combined light can be analyzed to measure interference. Areas of target tissue that reflect more light may result in more interference.
[0045] A set of A-scans can be generated by steering the beam laterally to produce interference information associated with multiple locations. Each A-scan in the set is a one-dimensional scan that characterizes depth at a particular lateral location. The set of A-scans (e.g., 128, 256, or 512 A-scans) can then be aligned with each other to produce a two-dimensional B-scan.
[0046] B-scans can depict, for example, at least a portion of the retina, macula, and/or optic nerve. The thickness between specific layers in the eye and/or the shape of various layers can indicate whether blood vessels have ruptured and leaked into the eye (e.g., into the retina, the subretinal space, or the space beneath the retinal pigment epithelium).
[0047] The imaging system 105 may further include one or more processors and/or one or more memories to perform computational actions. For example, computational actions may include generating B-scans using multiple A-scans, normalizing intensity values, changing resolution, and/or applying one or more filters. It should be understood that multiple B-scans can be generated for a given eye (e.g., based on multiple sets of A-scans). Each of the multiple B-scans may be associated with a different depth.
[0048] Imaging system 105 may transmit images to OCT image processing controller 110 and/or otherwise make them available to it. For example, imaging system 105 may upload images associated with one or more identifiers to a remote data storage device that is partially or fully accessible to OCT image processing controller 110. As another example, imaging system 105 may transmit one or more images of the eye to client system 115, which then transmits the images to OCT image processing controller 110 or otherwise makes them available to it. Client system 115 may be a computing system associated with a care provider (e.g., physician, hospital, ophthalmologist, medical technician, etc.) providing care to a subject whose eye has been imaged at imaging system 105.
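As a minimal illustration of aligning A-scans into a B-scan, the sketch below stacks equally long one-dimensional depth profiles side by side as columns of a 2D array. The function name `assemble_b_scan` is hypothetical, and real devices would typically also correct for motion and scale, which is omitted here.

```python
import numpy as np


def assemble_b_scan(a_scans) -> np.ndarray:
    """Stack 1D A-scans (depth profiles) as columns of a 2D B-scan.

    The result has shape (depth_samples, number_of_a_scans).
    """
    depths = {len(a) for a in a_scans}
    if len(depths) != 1:
        raise ValueError("all A-scans must be non-empty and have the same number of depth samples")
    return np.stack([np.asarray(a, dtype=float) for a in a_scans], axis=1)
```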
[0049] OCT image processing controller 110 may include preprocessing controller 120 configured to preprocess images (e.g., one or more B-scans). Preprocessing may be performed to mitigate variations in the images caused by various types of machines and / or environments. For example, preprocessing may include changing the image resolution, changing the image scaling, and / or changing the intensity distribution of the image (e.g., by applying normalization or standardization techniques).
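The intensity adjustments mentioned here can be sketched as two common options, min-max normalization to the range [0, 1] and standardization to zero mean and unit variance. Which of these (if either) the described system uses is not stated, so both function names and the choice of techniques are assumptions.

```python
import numpy as np


def min_max_normalize(image) -> np.ndarray:
    """Rescale intensities linearly to the range [0, 1]."""
    image = np.asarray(image, dtype=float)
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.zeros_like(image)  # constant image: avoid division by zero
    return (image - lo) / (hi - lo)


def standardize(image) -> np.ndarray:
    """Shift and scale intensities to zero mean and unit standard deviation."""
    image = np.asarray(image, dtype=float)
    std = image.std()
    if std == 0:
        return np.zeros_like(image)
    return (image - image.mean()) / std
```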
[0050] Due to the natural curvature of the eye, B-scans may depict curved layers. Therefore, preprocessing may include planarizing (flattening) the image. Planarizing the image may include initially estimating the location of a given structure. This location may be defined as a set of pixels. The structure may include the retinal pigment epithelium.
[0051] Detecting the structure may include segmentation. Detecting the structure may alternatively or additionally include applying a filter (e.g., a Gaussian filter) to denoise the image. Then, within each column, the one or more pixels having the highest intensity in that column can be initially identified as corresponding to the structure. A smoothing function can be used to favor the selection of pixels that are fairly contiguous across columns. Planarization can then be achieved by offsetting the columns relative to each other, such that the selected pixels (e.g., associated with the highest intensity) are aligned in a row.
[0052] In some instances, planarization may include segmenting the biological structure (e.g., by applying a filter and then thresholding the filtered image), and then fitting a function to the segmented pixels. This function may include a spline function. The columns of the image may then be offset relative to each other so that the fitted spline becomes a straight line.
[0053] Preprocessing may include cropping a portion of the planarized image. Cropping may be performed to produce an image of a target size. Cropping may be performed to remove the top and/or bottom of the image. For example, cropping may be performed to remove all pixels that lie above the planarized biological structure by a distance exceeding a first threshold and/or to remove all pixels that lie below the planarized biological structure by a distance exceeding a second threshold. In some instances, pixel intensity alteration (e.g., normalization or standardization) is performed after planarization.
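The column-wise approach in this paragraph can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions: the denoising filter is omitted, the smoothing function is replaced by a plain moving average over the per-column row indices, and the names `estimate_layer_rows` and `flatten_image` are hypothetical.

```python
import numpy as np


def estimate_layer_rows(image: np.ndarray, smooth_width: int = 5) -> np.ndarray:
    """Per column, pick the brightest row, then smooth the row indices across columns."""
    if smooth_width % 2 == 0:
        raise ValueError("smooth_width must be odd")
    rows = np.argmax(image, axis=0).astype(float)       # brightest pixel per column
    pad = smooth_width // 2
    padded = np.pad(rows, pad, mode="edge")             # repeat edge values at the borders
    kernel = np.ones(smooth_width) / smooth_width       # moving-average kernel
    smoothed = np.convolve(padded, kernel, mode="valid")
    return np.rint(smoothed).astype(int)


def flatten_image(image: np.ndarray, layer_rows: np.ndarray, target_row=None) -> np.ndarray:
    """Shift each column vertically so the detected layer lands on `target_row`."""
    h, w = image.shape
    if target_row is None:
        target_row = h // 2
    flat = np.zeros_like(image)                         # vacated pixels are zero-filled
    for col in range(w):
        s = int(target_row - layer_rows[col])           # positive: shift column down
        src = slice(max(0, -s), min(h, h - s))
        dst = slice(max(0, s), min(h, h + s))
        flat[dst, col] = image[src, col]
    return flat
```

Zero-filling the vacated pixels (rather than wrapping them around) keeps tissue from one edge of the scan from reappearing at the opposite edge, which matters less once the image is subsequently cropped around the flattened layer.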
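Cropping around the flattened structure might be sketched as below, assuming the structure has already been aligned to a single row. The function name `crop_around_layer` and the default distances (64 rows above, 32 rows below) are hypothetical stand-ins for the first and second thresholds.

```python
import numpy as np


def crop_around_layer(flat_image: np.ndarray, layer_row: int,
                      above: int = 64, below: int = 32) -> np.ndarray:
    """Keep rows within `above` pixels over and `below` pixels under the flattened layer."""
    h = flat_image.shape[0]
    top = max(0, layer_row - above)             # clamp to the image boundary
    bottom = min(h, layer_row + below + 1)      # +1 so the layer row itself is kept
    return flat_image[top:bottom, :]
```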
[0054] At least some of the images can be used to train a neural network, along with corresponding labels. The corresponding labels can indicate one or more features of a treatment applied within a time period following image collection. In other words, for each image in the training dataset, its collection date can be defined as the baseline time. The time period characterizing the treatment can begin, for example, at the baseline time, one month after the baseline time, two months after the baseline time, or three months after the baseline time. In some instances, the initial treatment is initiated shortly after or before the baseline time, and the monitored time period is defined as beginning after the initial treatment has been completed.
[0055] Treatment data storage 125 may include information for identifying the labels or may include the labels themselves. For example, for each of a group of subjects, treatment data storage 125 may include a record that includes an identifier associated with the subject or image, as well as observed treatment information. The observed treatment information may identify the type of treatment administered, the dates of treatment administration, the intervals between treatment administrations, and/or the amount of treatment administered within the monitored time period.
[0056] OCT image processing controller 110 may include OCT image processing training controller 130, which has access to training data including baseline images and corresponding treatment label data. In some instances, OCT image processing training controller 130 may generate labels to associate with each training data baseline image. Labels may be generated based on treatment data (e.g., from treatment data storage 125) corresponding to an identifier associated with the baseline image. For example, a label may identify the amount of treatment (e.g., of a specific type) administered within a monitored time period (e.g., by querying the treatment administration date within the corresponding time period and associated with the identifier corresponding to the baseline image). As another example, a label may identify the interval between the last two treatment administrations or the average (or median) interval between multiple consecutive pairs of treatment administrations. It should be understood that in some instances, treatment data storage 125 stores the labels themselves.
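Deriving training labels from recorded administration dates could look like the following sketch, which counts treatments inside a monitored window after baseline and computes the mean and last intervals between consecutive treatments. The function name, the dictionary keys, and the default window (starting at baseline and lasting 365 days) are hypothetical.

```python
from datetime import date, timedelta
from statistics import mean


def treatment_labels(baseline: date, treatment_dates, start_offset_days: int = 0,
                     window_days: int = 365) -> dict:
    """Summarize treatments administered within the monitored window after baseline."""
    start = baseline + timedelta(days=start_offset_days)
    end = start + timedelta(days=window_days)
    in_window = sorted(d for d in treatment_dates if start <= d < end)
    # Intervals (in days) between consecutive administrations inside the window.
    gaps = [(b - a).days for a, b in zip(in_window, in_window[1:])]
    return {
        "count": len(in_window),
        "mean_interval_days": mean(gaps) if gaps else None,
        "last_interval_days": gaps[-1] if gaps else None,
    }
```

Setting `start_offset_days` to a positive value would correspond to the case described earlier in which the monitored period begins only after an initial treatment phase has been completed.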
[0057] The OCT image processing training controller 130 can train one or more neural networks using preprocessed images and labels associated with a training dataset. The neural networks may include one or more deep neural networks and / or one or more convolutional neural networks. Each of the neural networks may be configured to include at least one, at least two, at least five, at least ten, at least fifteen, or at least twenty convolutional blocks. Each of the one or more deep convolutional neural networks may be a thin neural network with fewer than 20,000, fewer than 10,000, fewer than 6,000, or fewer than 3,000 learnable parameters.
[0058] In some instances, a neural network may include multiple neural networks, each trained to process images of a different size and/or each trained to process blocks corresponding to a different location. Thus, in some instances, initial preprocessing is performed at the image level (e.g., to planarize and crop the image). Subsequent preprocessing may be performed to prepare input for a specific neural network by, for example, extracting specific blocks from the planarized and cropped image, where the size and location of the blocks are determined based on metadata associated with the specific neural network. For example, the multiple neural networks may include a first set of neural networks trained to process blocks with 128x128 pixels; a second set of neural networks trained to process blocks with 256x256 pixels; a third set of neural networks trained to process blocks with 512x512 pixels; a fourth set of neural networks trained to process blocks with 1024x1024 pixels; and another neural network trained to process the entire image (a low-level, image-based network). Each neural network in a given set (e.g., the first set, the second set, etc.) can be associated with a given region of the image, wherein the regions associated with the networks of a given set can tile the image in an overlapping or non-overlapping manner.
[0059] The outputs from the multiple neural networks can be fed into another neural network (e.g., a committee machine), which can then integrate the results so that the neural networks work together as an ensemble model. For example, the other neural network can be configured to learn weights to apply to the outputs of the various neural networks. In some instances, the ensemble network learns an individual weight to apply to each output of a given low-level neural network (e.g., one associated with a specific block). In some instances, the ensemble network learns more complex relationships in which the weight applied to the output of a given neural network may depend on, for example, the output from the given neural network and/or the outputs from each of one or more other neural networks. In some instances, each block-specific neural network is configured to generate both an output and a confidence metric. The ensemble network can then (further or alternatively) determine the weight to be applied to a given output from a given low-level neural network based at least in part on the confidence metric from the given low-level neural network and/or on the confidence metrics from one or more other low-level neural networks.
[0060] The OCT image processing training controller 130 can be configured to jointly train all of the neural networks (e.g., all block-specific neural networks and the ensemble neural network). Alternatively, the OCT image processing training controller 130 can train each block-specific network, the low-level image-based network, and the ensemble network individually. In some instances, independent training is performed to initialize the parameters in each model, and then joint training is performed.
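Tiling a preprocessed image into fixed-size blocks, each of which could be routed to the network associated with its location, might be sketched as follows. The function name `tile_blocks` is hypothetical; a stride equal to the block size gives non-overlapping tiles, and a smaller stride gives overlapping ones.

```python
import numpy as np


def tile_blocks(image: np.ndarray, size: int, stride=None) -> dict:
    """Return {(top, left): block} for square blocks tiling the image."""
    stride = size if stride is None else stride
    h, w = image.shape
    blocks = {}
    for top in range(0, h - size + 1, stride):
        for left in range(0, w - size + 1, stride):
            blocks[(top, left)] = image[top:top + size, left:left + size]
    return blocks
```

Keying the blocks by their top-left corner is one simple way to record which location-specific network each block belongs to.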
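As a deliberately simplified stand-in for the ensemble step, the sketch below combines block-level outputs with a weighted average in which each weight is the product of an optional per-network weight and an optional confidence value. A trained committee network would learn these relationships rather than use a fixed formula; the function name `combine_block_outputs` and its parameters are hypothetical.

```python
import numpy as np


def combine_block_outputs(probs, confidences=None, learned_weights=None) -> float:
    """Weighted average of block-level outputs (e.g., probabilities of a "high" label)."""
    probs = np.asarray(probs, dtype=float)
    weights = (np.ones(probs.size) if learned_weights is None
               else np.asarray(learned_weights, dtype=float))
    if confidences is not None:
        weights = weights * np.asarray(confidences, dtype=float)  # trust confident blocks more
    if weights.sum() <= 0:
        raise ValueError("weights must sum to a positive value")
    return float(np.dot(weights, probs) / weights.sum())
```

For example, with block outputs 0.2 and 0.8 and confidences 1 and 3, the combined value is (0.2 + 2.4) / 4 = 0.65, so the more confident block dominates.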
[0061] The OCT image processing controller 110 may include a treatment intensity generator 135 that uses a trained neural network to generate results corresponding to OCT images not included in the training data. The OCT images may correspond to subjects and / or eyes not represented in the training data. The imaged eye may include an eye diagnosed with age-related macular degeneration and / or wet age-related macular degeneration.
[0062] OCT images may include a preprocessed image, which may include remote preprocessing (e.g., at imaging system 105) and / or preprocessing performed at preprocessing controller 120. Preprocessing may include one or more preprocessing techniques disclosed herein, such as using multiple A scans to generate B scans, planarizing B scan images, cropping the planarized image, and / or adjusting (e.g., normalizing or standardizing) the intensity.
[0063] The treatment intensity generator 135 can then feed the preprocessed OCT image into a trained neural network to generate an output corresponding to features of a predicted treatment plan effective for the treated eye (e.g., sufficient to prevent leakage in blood vessels between treatment applications). In some instances, the output corresponds to features of a treatment plan predicted to include a minimum amount of treatment applied across a time interval (and / or the longest duration between treatment applications) that will effectively treat the eye. The output can identify, for example, the amount of treatment to be applied within a predefined time period (e.g., a specific treatment, such as a specific aVEGF treatment); the frequency of treatment to be applied (e.g., a specific type of treatment); and the interval used to separate consecutive applications of treatment (e.g., represented as a given number or range with units). In some instances, the treatment intensity generator 135 applies one or more post-processing techniques to transform the output from the neural network into a result. For example, the output can identify a target frequency of treatment application, and the post-processing can transform the output into a set of dates (or date ranges) for treatments to be applied.
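The post-processing step mentioned above (turning a target treatment frequency into a set of dates) could look roughly like the following Python sketch. The function name `schedule_dates` and its parameters are hypothetical, and even spacing over the period is an assumption made only for illustration.

```python
from datetime import date, timedelta

def schedule_dates(start, n_treatments, period_days):
    """Spread n_treatments evenly over period_days, beginning on `start`.

    Assumption: the network output has already been converted into a
    treatment count for a period of known length.
    """
    if n_treatments <= 0 or period_days <= 0:
        raise ValueError("n_treatments and period_days must be positive")
    interval = period_days / n_treatments
    return [start + timedelta(days=round(k * interval)) for k in range(n_treatments)]

# Example: four treatments spread over a 360-day period.
plan = schedule_dates(date(2024, 1, 1), 4, 360)
```

A date range instead of a single date could be produced by returning a window around each scheduled day.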
[0064] The result can be returned to client device 115. Client device 115 may be associated with (e.g., owned, used, controlled, and / or operated by) an entity that provides medical care to a subject whose eye is at least partially depicted in the analyzed OCT image. For example, the entity may include a doctor, a doctor's office, or an ophthalmologist. In some instances, client device 115 initially provides the OCT image to OCT image processing controller 110. In some instances, client device 115 initiates and / or completes a request to imaging system 105 to collect OCT images.
[0065] The results indicating the characteristics of a treatment plan predicted to provide effective treatment for the eye may be used by the care-providing entity to inform the selection and / or definition of a treatment plan to be prescribed and / or recommended for the subject.
[0066] IV. Machine Learning Process for Generating Treatment Plan Labels Using OCT Images
[0067] Figure 2 shows a flowchart of a process 200 for processing OCT images using a deep neural network to generate labels corresponding to a treatment administration plan. At box 205, a set of OCT training images is accessed. Each OCT training image may have been collected from an eye of a subject diagnosed with age-related macular degeneration (e.g., wet age-related macular degeneration). Each OCT training image may be further associated with past data indicating the choice of treatment plan (e.g., frequency of aVEGF treatment administration) and / or treatment outcome (e.g., indicating the efficacy of treatment administered according to a specific treatment plan). Each of the OCT training images may have been associated with data indicating the subsequent subject status observed in response to a treatment regimen associated with a specific treatment (e.g., a specific therapeutic agent).
[0068] The training dataset can include multiple 2D images corresponding to a single subject's eye. For example, multiple 2D images can correspond to different depths.
[0069] At box 210, each of the OCT training images is planarized. Planarization can be performed to warp the image so that the depiction of a particular biological structure (e.g., the retinal layer, such as the retinal pigment epithelium) is substantially planar in the planarized image. Other parts of the image (e.g., pixels) can be warped to adjust their positions based on the planarization.
[0070] Specific biological structures can be identified based on metadata associated with the image (e.g., identifying pixels associated with layers) and / or by performing computer vision techniques (e.g., detecting and / or characterizing edges). In some instances, biological structures are identified during image acquisition and can be detected within an OCT scanner. Figure 4A shows a portion of the unplanarized OCT training images, and Figure 4B shows a portion of the planarized OCT training images. It should be understood that each individual OCT training image (and each planarized OCT training image) can comprise a 2D image. The bottom white layer in each image is the retinal pigment epithelium, which serves as the basis for planarization. Therefore, one approach for identifying specific target biological structures is to define a target range of intensity values (e.g., an open or closed range), detect pixels with intensity values within that range, and then define lines (e.g., curves) based on those detected pixels. The planarized OCT images are further cropped to produce planarized and cropped OCT images, as shown in Figure 4B.
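The intensity-range approach to locating a layer and planarizing on it can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the image is a list of pixel rows with row 0 at the top, the layer is taken to be the bottom-most in-range pixel of each column, and planarization is a per-column vertical shift with no curve fitting or smoothing. The function names are hypothetical.

```python
def find_layer_rows(image, lo, hi):
    """For each column, return the bottom-most row with intensity in [lo, hi], or None."""
    n_rows, n_cols = len(image), len(image[0])
    rows = []
    for c in range(n_cols):
        hit = None
        for r in range(n_rows - 1, -1, -1):  # scan upward from the bottom
            if lo <= image[r][c] <= hi:
                hit = r
                break
        rows.append(hit)
    return rows

def planarize(image, layer_rows, target_row, fill=0):
    """Shift every column vertically so the detected layer lands on target_row."""
    n_rows, n_cols = len(image), len(image[0])
    out = [[fill] * n_cols for _ in range(n_rows)]
    for c, layer in enumerate(layer_rows):
        shift = 0 if layer is None else target_row - layer
        for r in range(n_rows):
            src = r - shift
            if 0 <= src < n_rows:
                out[r][c] = image[src][c]
    return out
```

Columns in which no in-range pixel is found are left unshifted here; a practical implementation would more likely interpolate the layer position from neighboring columns.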
[0071] At box 215, each planarized OCT training image is cropped. Cropping may be performed to remove the top and / or bottom of the planarized image. The removed portion may lack depiction of a part of the retina and / or may have intensity statistics below a predefined threshold (e.g., average intensity, median intensity, or intensity variability). In some instances, the same or subsequent cropping is performed to generate blocks of planarized images (e.g., with a width smaller than the minimum or maximum width of the planarized and possibly initially cropped OCT training image). Cropping may be performed to produce images of a predefined size. In some instances, multiple biomarkers are detected, and cropping is performed to scale the training images to a default scale.
[0072] For example, each of the images depicted in Figures 3A to 3C was collected by imaging a single eye using a Zeiss Cirrus machine. Each of Figures 3A to 3C corresponds to a central B-scan at a different depth. Furthermore, each depicted image has been planarized (with respect to the retinal pigment epithelium layer) and cropped. Cropping is performed to include a region extending from 128 pixels below the planarized RPE layer to 384 pixels above the planarized RPE layer.
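The crop window described above (384 pixels above to 128 pixels below the planarized RPE) can be sketched in Python as follows. The function name `crop_around_layer` is hypothetical, and the sketch assumes row 0 is the top of the image, so "above" means a smaller row index; rows that fall outside the image are padded with a fill value.

```python
def crop_around_layer(image, layer_row, above=384, below=128, fill=0):
    """Keep `above` rows above the layer row and `below` rows from the layer row down.

    The result always has above + below rows (512 with the defaults).
    """
    n_cols = len(image[0])
    cropped = []
    for r in range(layer_row - above, layer_row + below):
        if 0 <= r < len(image):
            cropped.append(list(image[r]))
        else:
            cropped.append([fill] * n_cols)  # pad rows outside the image
    return cropped
```

Whether the RPE row itself counts toward the "below" portion is a convention chosen here for a fixed 512-row output; the source does not specify it.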
[0073] At box 220, for each of the OCT training images, a training marker characterizing a treatment administration event (e.g., intravitreal injection event) is identified. The training marker may be based on the treatment selected for the subject's eye (e.g., used after collecting the OCT training images and / or at a specific predetermined time point after collecting the OCT training images).
[0074] Training markers may include characteristics of the selected treatment or the outcome of the selected treatment. For example, training markers may identify the agent used for treatment, the dose of a specific therapeutic agent being administered, or the temporal characteristics of treatment administration (e.g., the frequency or cycle of drug administration or the time period between administrations). As an example, an agent may include any aVEGF agent or a specific aVEGF agent. As another example, training markers may identify whether a specific treatment or a specific type of treatment (and / or a specific dose of a specific treatment or a specific type of treatment) is sufficiently effective to prevent a specific event from being observed (e.g., new vascular leakage, or leakage between consecutive treatment administrations) or to cause a specific event to be observed (e.g., elimination of potential macular effusion secondary to wet age-related macular degeneration). As an additional or alternative example, training markers may indicate whether a specific treatment plan (e.g., a plan indicating the relative timing of multiple treatments to be administered and potentially the doses of those treatments) is sufficiently effective to prevent a specific event from being observed (e.g., new vascular leakage) or to cause a specific event to be observed (e.g., elimination of potential macular effusion secondary to wet age-related macular degeneration).
[0075] Treatment markers can indicate characteristics of treatments administered using methods in which the interval between consecutive treatment administrations decreases with each observed leakage, or treatment is administered after each leakage observation. In some instances, markers can indicate the amount of treatment administered within a given time period (e.g., possibly normalized by the duration of that time period to indicate frequency). This time period can include, for example, a period beginning from a baseline date and / or a date following the initial treatment period. The time period can have a defined duration (e.g., regarding training data).
[0076] In some instances, the markers may be based, in whole or in part, on one or more treatment modifications (or their absence). For example, a default treatment regimen may have been selected, and the training markers may reflect whether modifications to the default treatment regimen were used and / or which modifications to the default treatment regimen were used to treat the eye. In some instances, the markers identify the number or frequency of treatments of a given type (and, e.g., dosage) administered within a given time period.
[0077] At box 225, a deep neural network is trained using cropped and planarized training images and corresponding training labels. In some instances, box 225 includes training each of one or more deep neural networks using at least some of the cropped and planarized training images and corresponding training labels. For example, a set of deep neural networks can be used, where each of the set of deep neural networks is trained using patches of a specific size taken from the cropped and planarized training images (e.g., such that different patch sizes are handled by different networks). Each of the set of deep neural networks may have a similar or identical architecture, but can be trained using different portions of the training data, such that different networks learn different parameter values. Ensemble models can be used to aggregate and process results from the multiple patch-size-specific lower-level neural networks.
[0078] Each of the one or more deep convolutional neural networks may include at least one, at least two, at least five, at least ten, at least fifteen, or at least twenty convolutional blocks. Each of the one or more deep convolutional neural networks may be a thin neural network with fewer than 20,000, fewer than 10,000, fewer than 6,000, or fewer than 3,000 learnable parameters.
[0079] It should be understood that in some instances, cross-validation techniques (e.g., nested cross-validation) can be used to select hyperparameters and / or estimate error. More specifically, cross-validation can be used, instead of dividing the training dataset into fixed training, validation, and test subsets, to estimate generalization ability. The latter technique can involve developing a model with training data, selecting an optimized model with validation data, and evaluating the performance of the selected optimized model with test data. Meanwhile, as shown in Figure 5, the nested cross-validation method can use multi-level cross-validation iterations (i.e., outer loop 505 and inner loop 510) to achieve data splitting and evaluation of the model's generalization ability.
[0080] In the outer loop 505, the data can be split into test data and a combination of training and validation data (e.g., according to a five-fold splitting method). In each iteration, the training and validation data can be further separated in the inner loop 510. The training data can be used to train one or more fully convolutional neural networks. In some instances, the training data is used to train a single fully convolutional neural network. Each of the fully convolutional neural networks can include a relatively small number of parameters (e.g., fewer than 10,000, fewer than 8,000, or fewer than 5,000). Validation data can be used to select hyperparameters for the model (e.g., determine optimal hyperparameters). In the depicted instance, validation data is used to determine a specific stopping epoch (e.g., to optimize the stopping epoch). For a given split of training and validation data, a specific stopping epoch can be determined by selecting the epoch with the lowest validation loss (among the multiple epochs used during training).
[0081] Training and validation splits can be repeated, allowing different portions of the data to be assigned to the training data in each iteration. The final stopping epoch (t* in the figure) can be a statistic generated based on the stopping epochs identified for each split (e.g., the average of the stopping epochs across splits). The combination of training and validation data can be used to train multiple models with the final stopping epoch. Each of the multiple models may have the same architecture, but may include different learned parameters due to training with the training data elements in different orders. These multiple models can form a model ensemble or ensemble model 515. The ensemble model can be evaluated on the held-out test data.
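The selection of the final stopping point from the inner-loop splits can be sketched in Python as follows. This is an illustrative sketch only: it assumes that a validation-loss curve (one value per training epoch) has already been recorded for each inner split, and the function names are hypothetical.

```python
from statistics import mean

def best_epoch(val_losses):
    """Return the 1-based training epoch with the lowest validation loss."""
    index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return index + 1

def final_stopping_epoch(curves):
    """Average the per-split best epochs to obtain t* (rounded to an integer)."""
    return round(mean(best_epoch(curve) for curve in curves))

# Hypothetical validation-loss curves for three inner splits.
inner_curves = [
    [0.90, 0.70, 0.60, 0.65],  # best at epoch 3
    [0.80, 0.50, 0.55, 0.60],  # best at epoch 2
    [0.90, 0.80, 0.70, 0.60],  # best at epoch 4
]
t_star = final_stopping_epoch(inner_curves)
```

The models trained on the combined training and validation data would then each be run for `t_star` epochs before being evaluated on the held-out test fold.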
[0082] At box 230, the input image can be planarized and cropped. The input image may include an OCT image of an eye of a subject suffering from age-related macular degeneration (e.g., wet age-related macular degeneration). In some instances, the input image includes an image not represented in the training data. Planarization can be performed using the same or similar techniques applied at box 210 to planarize the training image. Cropping can be performed using the same or similar techniques applied at box 215. In some instances, cropping is performed to remove top and / or bottom areas that do not depict the target region (e.g., a portion of the retina). Cropping may, but need not, extend to generating one or more blocks (e.g., regardless of whether blocks were used to train the model).
[0083] At box 235, a trained deep neural network is used to process the planarized and cropped input image to generate labels. These labels may correspond to features of a treatment administration plan. The labels may identify one or more features of a treatment administration plan (e.g., associated with the administration of an aVEGF agent) that is predicted to be effective for treating the eye depicted in the input image (e.g., eliminating potential macular effusion secondary to wet age-related macular degeneration) and / or for preventing disease progression (e.g., preventing progression to wet age-related macular degeneration).
[0084] The labels may alternatively or additionally correspond to a prediction of whether a treatment administration plan (e.g., associated with the administration of an aVEGF agent) will be effective in treating the eye and / or preventing a particular type of progression.
[0085] At box 240, the labels corresponding to the treatment administration plan are output. For example, the labels can be presented or transmitted (e.g., presented at or transmitted to a device associated with a healthcare provider). The labels can be output along with other information about the subject, such as the subject's name and / or diagnosis date.
[0086] V. Example 1
[0087] The deep learning model was used to analyze OCT images collected by imaging the retina diagnosed with age-related macular degeneration to generate eye- and subject-specific predictions about which type of aVEGF treatment would be provided to the depicted eye for each subject.
[0088] V.A. Method
[0089] 1042 OCT images captured during the loading phase were obtained from the PRN arm of the HARBOR study of nAMD. Specifically, OCT images of the study eyes of 352 subjects were assessed and correlated with the burden of anti-VEGF treatment administration over 23 months. Low anti-VEGF treatment requirement was defined as 5 or fewer injections across the 21 visits following the loading phase, that is, between the completion of 3 consecutive monthly administrations of ranibizumab and the 23-month visit. (See Bogunovic H et al. Prediction of anti-VEGF treatment requirements in neovascular AMD using a machine learning approach. Invest Ophthalmol Vis Sci. 2017; 58(7):3240-3248, which is incorporated herein by reference in its entirety for all purposes.) A stratified 5-fold split was created at the subject level for nested cross-validation.
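A subject-level stratified split can be sketched in Python as follows. This is only an illustration: the source does not state which variable was used for stratification, so the sketch assumes it is the treatment-requirement label of each subject, and the function name `subject_level_folds` is hypothetical. Assigning folds per subject keeps all images of one subject in the same fold.

```python
import random
from collections import defaultdict

def subject_level_folds(subject_labels, n_folds=5, seed=0):
    """Map each subject to a fold index while balancing labels across folds.

    subject_labels: dict of subject id -> label (assumed stratification variable).
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for subject, label in subject_labels.items():
        by_label[label].append(subject)
    fold_of = {}
    offset = 0
    for label in sorted(by_label):
        subjects = sorted(by_label[label])
        rng.shuffle(subjects)
        # Deal subjects of this label round-robin across the folds.
        for i, subject in enumerate(subjects):
            fold_of[subject] = (i + offset) % n_folds
        offset += len(subjects)
    return fold_of
```

Each image then inherits the fold of its subject, which prevents scans of the same eye from appearing on both sides of a split.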
[0090] OCT images (1024×512×128 resolution) from a Zeiss Cirrus machine are planarized (or re-baselined) with respect to the retinal pigment epithelium (RPE) layer and cropped to 384 pixels above and 128 pixels below the RPE. The central 15 B-scans are selected. Random cropping (e.g., using a random process to determine the crop location and possibly the crop size) is applied to sample training and validation blocks of random or pre-specified sizes.
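Random cropping of a fixed-size block can be sketched in Python as follows. The function name `random_block` is hypothetical, the image is assumed to be a list of pixel rows, and the crop location is drawn uniformly, which is an assumption since the source does not describe the sampling distribution.

```python
import random

def random_block(image, block_h, block_w, rng):
    """Cut a block_h x block_w block at a uniformly random location."""
    n_rows, n_cols = len(image), len(image[0])
    if block_h > n_rows or block_w > n_cols:
        raise ValueError("block does not fit inside the image")
    top = rng.randint(0, n_rows - block_h)    # randint is inclusive at both ends
    left = rng.randint(0, n_cols - block_w)
    return [row[left:left + block_w] for row in image[top:top + block_h]]

# Example: sample one 4 x 6 block from an 8 x 10 image.
rng = random.Random(0)
demo_image = [[r * 10 + c for c in range(10)] for r in range(8)]
block = random_block(demo_image, 4, 6, rng)
```

Sampling the block size from a set of candidate sizes before calling the function would give blocks of random rather than pre-specified size.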
[0091] The deep learning network is designed to contain 10 convolutional blocks to exponentially improve performance with fewer than 5600 weight parameters. The deep learning network includes fully convolutional layers. This design facilitates fast and computationally inexpensive training. The F-CNN (Fully Convolutional Neural Network) architecture is applied to take blocks of arbitrary size as input and make predictions across the entire slice. A committee machine is used as an ensemble model. Each committee member model generates block-level predictions, and the committee machine aggregates the block-level predictions to generate image-level results (e.g., by determining the average of the block-level predictions). The image-level results correspond to binary predictions about whether a specific treatment intensity (e.g., high anti-VEGF therapy, or in other cases, low anti-VEGF therapy) will be administered. Low anti-VEGF therapy is defined as five or fewer anti-VEGF therapy administrations during the 20-month period between the 3-month loading phase (with monthly administration of ranibizumab) and the 23-month visit.
[0092] V.B. Results
[0093] Of the 547 eyes used in the PRN study, 352 unique study eyes were suitable for analysis. Among these, the model predicted that 79 (22.4%) eyes would be classified as having a disease that could be effectively treated with only low-dose anti-VEGF therapy.
[0094] 1042 OCT scans from the loading phase were used for modeling. For each scan, observed treatments were categorized as low, intermediate, or high anti-VEGF treatments, and predicted treatments were similarly categorized. The model achieved an area under the receiver operating characteristic curve (AUROC) of 78.6% (ranging from 72.7% to 84.4%).
[0095] V.C. Conclusion
[0096] Deep learning models have demonstrated strong performance in predicting which type of anti-VEGF therapy (e.g., low, intermediate, or high) an individual nAMD subject will receive after the ranibizumab loading phase. These types of results can help physicians / subjects understand their future treatment burden. These results can be further or alternatively used for stratifying subjects in future clinical studies aimed at assessing the durability of anti-VEGF therapy administration.
[0097] VI. Example 2
[0098] Another analysis was conducted using a different set of OCT images collected by imaging the retina diagnosed with age-related macular degeneration. Deep learning models were used to generate eye- and subject-specific predictions regarding which type of aVEGF treatment would subsequently be provided to the depicted eye for each subject.
[0099] VI.A. Method
[0100] A deep learning model was trained using 1069 OCT images collected from 362 subjects. This training data was repeatedly split into a first subset for training the model and a second subset for validating it. During each of these data splits, the accuracy of the model's predictions against the validation data was determined. When evaluating predictions generated using the validation subset, average and summary statistics (e.g., average AUROC statistics and summary AUROC statistics) were then generated by assessing the accuracy of predictions across the data splits. A reserved set (not used for training) comprised 183 OCT images from 62 subjects. The burden of anti-VEGF treatment administration was assessed using each of the following two metric frameworks:
[0101] • The first “MUV” framework: Low anti-VEGF treatment requirement is defined as 5 or fewer treatment administrations between the completion of 3 consecutive monthly ranibizumab injections and the 23-month visit. High anti-VEGF treatment requirement is defined as 16 or more aVEGF treatment administrations between the completion of 3 consecutive monthly ranibizumab treatment administrations and the 23-month visit.
[0102] • The second “Turing” framework: Low anti-VEGF treatment requirement is defined as two or fewer aVEGF treatment administrations between the completion of three consecutive monthly administrations of ranibizumab and the 12-month visit. High anti-VEGF treatment requirement is defined as nine or more aVEGF treatment administrations between the completion of three consecutive monthly administrations of ranibizumab and the 12-month visit.
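The two labeling frameworks above amount to threshold rules on the number of treatment administrations. The Python sketch below expresses them that way. The names `intensity_label`, `MUV`, and `TURING` are hypothetical, the "intermediate" category name is borrowed from Example 1, and the injection count is assumed to have been tallied already over the window each framework specifies (up to the 23-month visit for MUV, up to the 12-month visit for Turing).

```python
# Thresholds taken from the two framework definitions above.
MUV = {"low_max": 5, "high_min": 16}     # counted up to the 23-month visit
TURING = {"low_max": 2, "high_min": 9}   # counted up to the 12-month visit

def intensity_label(n_administrations, low_max, high_min):
    """Classify a post-loading-phase administration count as low, intermediate, or high."""
    if n_administrations <= low_max:
        return "low"
    if n_administrations >= high_min:
        return "high"
    return "intermediate"

# Example: the same count can fall into different categories per framework.
muv_label = intensity_label(9, **MUV)
turing_label = intensity_label(9, **TURING)
```

A binary training target (e.g., "high" versus "not high") follows by comparing the returned category with the class of interest.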
[0103] The OCT images were preprocessed using the same type of planarization and cropping scheme as described in Example 1. The deep learning dense network was configured as described in Example 1.
[0104] VI.B. Results
[0105] Figures 6A to 6B show aggregated cross-validation receiver operating characteristic (ROC) curves, which characterize the accuracy of predictions generated by the dense network using labels of the first (MUV) framework. The data represent the accuracy characteristics evaluated on validation data. Figure 6A characterizes the accuracy of predictions regarding whether a given subject's eye would receive high-intensity anti-VEGF treatment. To generate the ROC, a threshold was varied within a certain range. For each image (e.g., in a data split or the reserved dataset) and for each threshold, a prediction was generated regarding whether the subject received the specific treatment plan considered a "positive" instance, based on whether the output generated by the deep learning model exceeded the threshold. For each dataset (e.g., a data split or the reserved dataset), the sensitivity metric was calculated as the number of true positives divided by the sum of true positives and false negatives. Further, for each dataset, the specificity metric was calculated as the number of true negatives divided by the sum of true negatives and false positives.
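The sensitivity and specificity calculations described above can be written out in Python as follows. The function name `sensitivity_specificity` is hypothetical; a prediction counts as positive when the model output exceeds the threshold, matching the description.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Compute sensitivity TP / (TP + FN) and specificity TN / (TN + FP) at one threshold."""
    tp = fp = tn = fn = 0
    for score, is_positive in zip(scores, labels):
        predicted_positive = score > threshold
        if predicted_positive and is_positive:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif is_positive:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

Evaluating the function over a range of thresholds yields the set of points from which a ROC curve is drawn.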
[0106] Ideally, both sensitivity and specificity metrics should be close to 1.0 (or equal to 1.0 for at least one threshold). In that case, a receiver operating characteristic (ROC) curve plotting sensitivity versus specificity across thresholds would include one or more points near or located at (100, 100), and the curve would not follow a single line. One technique for measuring how well the ROC possesses these properties is to determine the area under the curve. An area under the curve close to or equal to 100% indicates that, for at least one threshold, the model is capable of successfully predicting both positive and negative instances.
[0107] Figure 6A corresponds to an analysis where a "positive" instance is defined as an instance where a subject received high-intensity anti-VEGF treatment, and high-intensity anti-VEGF treatment is defined according to the MUV framework definition above. ROC curves were calculated for each of the 10 data splits. Figure 6A shows the mean ROC and the ROC distribution. The mean AUROC (calculated by averaging the AUROC calculated for each of the 10 data splits) is 0.77, and the aggregated AUROC (calculated by first averaging the ROC of the data splits, and then using the mean ROC to calculate the AUROC) is 0.76.
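The difference between the mean AUROC and the aggregated AUROC can be illustrated with the following Python sketch. It is a sketch under stated assumptions, not the procedure actually used: the ROC curves are averaged by linear interpolation on a common false-positive-rate grid, areas use the trapezoidal rule, each split is assumed to contain both classes, and all function names are hypothetical.

```python
def roc_curve(scores, labels):
    """Return (fpr, tpr) lists obtained by sweeping the threshold over the scores."""
    pos = sum(1 for y in labels if y)
    neg = len(labels) - pos
    fpr, tpr = [0.0], [0.0]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr

def trapezoid(xs, ys):
    """Area under a piecewise-linear curve."""
    return sum((xs[i + 1] - xs[i]) * (ys[i + 1] + ys[i]) / 2 for i in range(len(xs) - 1))

def interpolate(x, xs, ys):
    """Linearly interpolate the curve at x, taking the upper value at vertical jumps."""
    i = max(k for k in range(len(xs)) if xs[k] <= x)
    if i == len(xs) - 1:
        return ys[i]
    frac = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])

def mean_and_aggregated_auroc(splits, grid_size=101):
    """splits: list of (scores, labels) pairs, one per data split."""
    curves = [roc_curve(scores, labels) for scores, labels in splits]
    # Mean AUROC: average of the per-split areas.
    mean_auroc = sum(trapezoid(f, t) for f, t in curves) / len(curves)
    # Aggregated AUROC: area under the averaged ROC curve.
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    mean_tpr = [sum(interpolate(x, f, t) for f, t in curves) / len(curves) for x in grid]
    return mean_auroc, trapezoid(grid, mean_tpr)
```

Because the averaged curve is built on an interpolation grid, the two statistics are generally close but need not coincide, which is consistent with the small differences between the reported mean and aggregated values.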
[0108] Figure 6B corresponds to an analysis where a "positive" instance is defined as an instance where a subject received low-intensity anti-VEGF treatment (as defined by the MUV framework). The plotted graph characterizes the accuracy of predictions regarding whether a given subject's eye would receive low-intensity anti-VEGF treatment. The mean AUROC is 0.80, and the aggregated AUROC is 0.79.
[0109] Figure 6C shows receiver operating characteristic curves (with AUROCs) for predicting high and low treatment intensity using a random forest model on extracted features (e.g., where 10-fold cross-validation is used to generate a binary prediction about whether a specific intensity of treatment is administered). As shown, the AUROCs for predicting high and low intensity treatment are 0.77 and 0.70, respectively, both lower than those corresponding to predictions from the dense neural network technique (the data associated with Figures 6A to 6B).
[0110] Figures 7A to 7B show ROC curves characterizing the accuracy of predictions made using labels of the second (Turing) framework. Figure 7A characterizes the accuracy of predictions regarding whether a given subject's eye would receive high-intensity anti-VEGF treatment (as defined using the second Turing framework). The mean AUROC was 0.73, and the aggregated AUROC was 0.75 (calculated via 10-fold cross-validation). Figure 7B characterizes the accuracy of predictions regarding whether a given subject's eye would receive low-intensity anti-VEGF treatment (as defined using the second Turing framework). The mean AUROC was 0.69, and the aggregated AUROC was 0.69 (calculated via 10-fold cross-validation). Figure 7C shows receiver operating characteristic curves (with AUROCs) for predicting high-intensity and low-intensity treatments using a random forest model on extracted features. As illustrated, the AUROCs for predicting high-intensity and low-intensity treatments are 0.76 and 0.66, respectively, both lower than those corresponding to predictions from the dense neural network technique (the data associated with Figures 7A to 7B).
[0111] Table 1 summarizes the end-to-end performance comparisons using different metric frameworks and prediction techniques.
[0114] Table 1
[0115] VI.C. Conclusion
[0116] When predictions of anti-VEGF treatment following the loading phase were evaluated using validation or reserved data (in terms of predicting the frequency of treatment administration based on processing of baseline OCT images), the deep learning model continued to demonstrate strong performance. The performance remained robust even when different types of performance metrics were used. Furthermore, the performance surpassed that of other techniques used for predicting anti-VEGF therapy.
[0117] VII. Additional Considerations
[0118] Some embodiments of this disclosure include a system comprising one or more data processors. In some embodiments, the system includes a non-transitory computer-readable storage medium containing instructions that, when executed on the one or more data processors, cause the one or more data processors to perform part or all of one or more methods and / or part or all of one or more processes disclosed herein. Some embodiments of this disclosure include a computer program product tangibly embodied in a non-transitory machine-readable storage medium, comprising instructions configured to cause one or more data processors to perform part or all of one or more methods and / or part or all of one or more processes disclosed herein.
[0119] The terms and expressions used are descriptive rather than restrictive, and their use is not intended to exclude any equivalents of the features or portions thereof shown and described, but it should be recognized that various modifications can be made within the scope of the claimed invention. Therefore, it should be understood that while the claimed invention has been specifically disclosed by way of examples and optional features, modifications and variations of the concepts disclosed herein can be made by those skilled in the art, and such modifications and variations are considered to be within the scope of the invention as defined in the appended claims.
[0120] This description provides only preferred exemplary embodiments and is not intended to limit the scope, applicability, or configuration of this disclosure. Rather, this description of preferred exemplary embodiments will provide those skilled in the art with a feasible description for implementing various embodiments. It should be understood that various changes can be made to the function and arrangement of the elements without departing from the spirit and scope set forth in the appended claims.
[0121] Specific details are set forth in this description to provide a thorough understanding of the embodiments. However, it should be understood that the embodiments may be practiced without these specific details. For example, circuits, systems, networks, processes, and other components may be shown as parts in block diagram form to avoid obscuring the embodiments with unnecessary details. In other instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary details to avoid obscuring the embodiments.
Claims
1. A computer-implemented method, comprising: accessing an optical coherence tomography (OCT) image corresponding to an eye of a subject with age-related macular degeneration; identifying a set of pixels corresponding to a retinal layer within the OCT image; planarizing the OCT image based on the set of pixels; generating a plurality of blocks using the planarized OCT image, wherein generating the plurality of blocks includes: performing one or more cropping processes using the planarized OCT image to produce one or more cropped images; and extracting the plurality of blocks from the one or more cropped images, wherein the plurality of blocks includes blocks having multiple sizes; inputting the plurality of blocks into a plurality of block-specific neural networks, wherein: each block-specific neural network has been trained on training blocks of a specific size to predict features of an effective treatment plan; and each of the plurality of blocks is input into a block-specific neural network based on the size of the block; generating a plurality of block-specific outputs using the plurality of block-specific neural networks, wherein: each of the plurality of block-specific outputs corresponds to a specific block among the plurality of blocks; and each of the plurality of block-specific outputs predicts features of a suggested treatment plan that is effective for the eye of the subject; weighting the block-specific outputs using an ensemble neural network that has learned weighting relationships; generating, by the ensemble neural network and based on the weighted block-specific outputs, a label corresponding to features of the suggested treatment plan for the eye of the subject; and outputting the label.
2. The computer implementation method of claim 1, wherein outputting the marker includes outputting the marker at or to a client device for administering anti-vascular endothelial growth factor (aVEGF) treatment to the subject's eye according to the recommended treatment plan; and The markings indicate the frequency of treatment application, which is predicted to be sufficiently effective to prevent fluid leakage from the blood vessels in the eye between consecutive treatment applications.
3. The computer-implemented method of claim 1, wherein the features of the proposed treatment plan indicate the interval between consecutive administrations of treatment.
4. The computer implementation method according to claim 1, wherein the retinal layer comprises the retinal pigment epithelium layer.
5. The computer implementation method of claim 1, wherein the marker identifies the subject's eye as a feature of the proposed treatment plan and predicts the eye to be effectively treated based on weighted, block-specific outputs generated by the ensemble neural network; The recommended treatment plan is characterized by the interval between consecutive administrations of anti-vascular endothelial growth factor (aVEGF) therapy; and The method further includes administering aVEGF treatment to the subject's eyes according to the recommended treatment plan.
6. The computer-implemented method of claim 1, wherein the weighting relationship includes applying a weight to each block-specific output based on the block-specific neural network that generated that block-specific output.
7. The computer-implemented method of claim 1, wherein the weighting relationship includes applying weights to each block-specific output based on other block-specific outputs.
8. A system, comprising:
one or more data processors; and
a non-transitory computer-readable storage medium containing instructions that, when executed on the one or more data processors, cause the one or more data processors to perform a set of actions including:
accessing an optical coherence tomography (OCT) image corresponding to an eye of a subject with age-related macular degeneration;
identifying a set of pixels corresponding to a retinal layer within the OCT image;
planarizing the OCT image based on the set of pixels;
generating a plurality of blocks using the planarized OCT image, wherein generating the plurality of blocks includes:
  performing one or more cropping processes using the planarized OCT image to produce one or more cropped images; and
  extracting the plurality of blocks from the one or more cropped images, wherein the plurality of blocks includes blocks having multiple sizes;
inputting the plurality of blocks into a plurality of block-specific neural networks, wherein:
  each block-specific neural network has been trained on training blocks of a specific size to predict a feature of an effective treatment plan; and
  each block of the plurality of blocks is input into a block-specific neural network selected based on the size of the block;
generating a plurality of block-specific outputs using the plurality of block-specific neural networks, wherein:
  each of the plurality of block-specific outputs corresponds to a specific block of the plurality of blocks; and
  each of the plurality of block-specific outputs predicts a feature of a recommended treatment plan that is effective for the subject's eye;
weighting the block-specific outputs using an ensemble neural network that has learned a weighting relationship;
generating, by the ensemble neural network and based on the weighted block-specific outputs, a label corresponding to a feature of the recommended treatment plan for the subject's eye; and
outputting the label.
9. The system of claim 8, wherein outputting the label includes outputting the label at or to a client device for administering anti-vascular endothelial growth factor (aVEGF) treatment to the subject's eye according to the recommended treatment plan; and the label indicates a frequency of treatment administration that is predicted to be sufficiently effective to prevent fluid leakage from blood vessels in the eye between consecutive treatment administrations.
10. The system of claim 8, wherein the feature of the recommended treatment plan indicates the interval between consecutive administrations of treatment.
11. The system of claim 8, wherein the retinal layer comprises the retinal pigment epithelium.
12. The system of claim 8, wherein the recommended treatment plan includes a recommended plan for administering anti-vascular endothelial growth factor.
13. The system of claim 8, wherein the weighting relationship includes applying a weight to each block-specific output based on the block-specific neural network that generated that block-specific output.
14. The system of claim 8, wherein the weighting relationship includes applying weights to each block-specific output based on other block-specific outputs.
15. A computer program product tangibly embodied in a non-transitory machine-readable storage medium, comprising instructions configured to cause one or more data processors to perform a set of actions including:
accessing an optical coherence tomography (OCT) image corresponding to an eye of a subject with age-related macular degeneration;
identifying a set of pixels corresponding to a retinal layer within the OCT image;
planarizing the OCT image based on the set of pixels;
generating a plurality of blocks using the planarized OCT image, wherein generating the plurality of blocks includes:
  performing one or more cropping processes using the planarized OCT image to produce one or more cropped images; and
  extracting the plurality of blocks from the one or more cropped images, wherein the plurality of blocks includes blocks having multiple sizes;
inputting the plurality of blocks into a plurality of block-specific neural networks, wherein:
  each block-specific neural network has been trained on training blocks of a specific size to predict a feature of an effective treatment plan; and
  each block of the plurality of blocks is input into a block-specific neural network selected based on the size of the block;
generating a plurality of block-specific outputs using the plurality of block-specific neural networks, wherein:
  each of the plurality of block-specific outputs corresponds to a specific block of the plurality of blocks; and
  each of the plurality of block-specific outputs predicts a feature of a recommended treatment plan that is effective for the subject's eye;
weighting the block-specific outputs using an ensemble neural network that has learned a weighting relationship;
generating, by the ensemble neural network and based on the weighted block-specific outputs, a label corresponding to a feature of the recommended treatment plan for the subject's eye; and
outputting the label.
16. The computer program product of claim 15, wherein outputting the label includes outputting the label at or to a client device for administering anti-vascular endothelial growth factor (aVEGF) treatment to the subject's eye according to the recommended treatment plan; and the label indicates a frequency of treatment administration, or an interval between consecutive treatment administrations, that is predicted to be sufficiently effective to prevent fluid leakage from blood vessels in the eye between consecutive treatment administrations.
17. The computer program product of claim 15, wherein the weighting relationship includes applying a weight to each block-specific output based on the block-specific neural network that generated that block-specific output.
18. The computer program product of claim 15, wherein the weighting relationship includes applying weights to each block-specific output based on other block-specific outputs.
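By way of non-limiting illustration, the processing steps recited in claims 1, 8, and 15 might be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the claimed implementation. All function names are hypothetical. The brightest pixel of each image column stands in for a real retinal-layer segmentation, simple callables stand in for the trained block-specific neural networks, and a fixed normalized weighting with a threshold stands in for the trained ensemble neural network. The label strings "high_frequency" and "low_frequency" are likewise illustrative only.

```python
import numpy as np

def find_layer_rows(image):
    # Assumption: the brightest pixel in each column stands in for a
    # real retinal-layer (e.g., retinal pigment epithelium) segmentation.
    return image.argmax(axis=0)

def planarize(image, layer_rows):
    # Shift every column vertically so the layer pixels share one row.
    # np.roll wraps pixels around the edge; a real pipeline might pad instead.
    target = int(np.median(layer_rows))
    flat = np.empty_like(image)
    for col, row in enumerate(layer_rows):
        flat[:, col] = np.roll(image[:, col], target - int(row))
    return flat, target

def crop_band(flat, target, half_height):
    # One cropping process: keep a horizontal band centred on the layer row.
    top = max(target - half_height, 0)
    bottom = min(target + half_height, flat.shape[0])
    return flat[top:bottom, :]

def extract_blocks(cropped, sizes):
    # Tile the cropped image into non-overlapping square blocks of each size.
    blocks = []
    for s in sizes:
        for r in range(0, cropped.shape[0] - s + 1, s):
            for c in range(0, cropped.shape[1] - s + 1, s):
                blocks.append(cropped[r:r + s, c:c + s])
    return blocks

def run_block_networks(blocks, networks):
    # Route each block to the network registered for its size.
    outputs = {size: [] for size in networks}
    for block in blocks:
        outputs[block.shape[0]].append(float(networks[block.shape[0]](block)))
    return outputs

def ensemble_label(outputs, weights, threshold=0.5):
    # Stand-in for the ensemble network: one weight per block-specific
    # network, normalized, applied to that network's mean output.
    sizes = [s for s in sorted(outputs) if outputs[s]]
    w = np.array([weights[s] for s in sizes], dtype=float)
    w /= w.sum()
    per_network = np.array([np.mean(outputs[s]) for s in sizes])
    score = float(w @ per_network)
    label = "high_frequency" if score >= threshold else "low_frequency"
    return label, score

# Usage on a synthetic image with one bright, curved layer.
rng = np.random.default_rng(0)
image = rng.uniform(0.0, 0.3, size=(64, 96))
cols = np.arange(96)
image[(30 + np.rint(5 * np.sin(cols / 10))).astype(int), cols] = 1.0

flat, target = planarize(image, find_layer_rows(image))
blocks = extract_blocks(crop_band(flat, target, 16), sizes=(16, 32))
stub = lambda block: 1.0 / (1.0 + np.exp(-block.mean()))  # untrained placeholder
outputs = run_block_networks(blocks, {16: stub, 32: stub})
label, score = ensemble_label(outputs, weights={16: 0.4, 32: 0.6})
print(label, round(score, 3))
```

The per-network weights in `ensemble_label` correspond most closely to the weighting described in claims 6, 13, and 17, where the weight depends on which block-specific network produced an output. A weighting that depends on the other block-specific outputs, as in claims 7, 14, and 18, would need a model that takes all outputs jointly as input.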