Methods for assessing thoracic aortic aneurysm

A machine learning model utilizing aortic diameter, length, and non-size factors provides personalized risk assessments for thoracic aortic aneurysms, improving accuracy and reducing false positives in predicting adverse events.

WO2026122782A1 · PCT designated stage · Publication Date: 2026-06-11 · YALE UNIVERSITY

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
YALE UNIVERSITY
Filing Date
2025-12-04
Publication Date
2026-06-11

AI Technical Summary

Technical Problem

Current methods for assessing thoracic aortic aneurysms lack accuracy and fail to consider personalized risk factors, leading to discrepancies in measurement and potential underestimation of rupture risk.

Method used

A machine learning model is trained using a dataset incorporating clinical, demographic, and genetic variables, including aortic diameter, length, and non-size factors to provide personalized risk assessments for aortic aneurysms.

Benefits of technology

The model offers improved accuracy and specificity in predicting adverse aortic events, reducing false positives and maintaining high sensitivity, thereby enhancing patient outcomes through personalized treatment decisions.

✦ Generated by Eureka AI based on patent content.

Abstract

Provided herein is a method of training a machine learning model for assessing aortic risk. The method includes providing an initial dataset; forming a training dataset from the initial dataset; and training a machine learning model using the training dataset. Also provided herein are a method of assessing risk using the trained machine learning model, an apparatus for assessing aortic risk, and a computer readable storage medium including instructions for performing the method of training the machine learning model.

Description

PCT / US25 / 58084 04 December 2025 (04.12.2025)
Attorney Docket No. 047162-7541WOl(02761)

METHODS FOR ASSESSING THORACIC AORTIC ANEURYSM

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] The present application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Patent Application No. 63/728,140, filed December 4, 2024, which application is incorporated herein by reference in its entirety.

BACKGROUND OF THE INVENTION

[0002] Thoracic aortic aneurysms are menacing dilatations of the main artery of the body. These aneurysms are usually silent until sudden, unexpected rupture occurs. Fully 8% of all sudden deaths are found to be due to aneurysm disease.

[0003] Because aneurysms usually produce no symptoms before rupturing, the current practice is to operate prophylactically when an aneurysm reaches 5 cm diameter. However, there are often discrepancies in measurement of the ascending aorta. Additionally, this single size manual assessment criterion likely ignores and / or fails to account for other important variables.

[0004] Accordingly, there remains a need in the art for articles and methods that improve upon existing articles and methods for assessing aneurysms by providing more accurate, personalized risk assessments to improve patient outcomes. The present disclosure meets this need.

SUMMARY

[0005] In one aspect, a method of training a machine learning model for assessing aortic risk includes providing an initial dataset; forming a training dataset from the initial dataset; and training a machine learning model using the training dataset.

[0006] In some embodiments, the initial dataset includes at least one of clinical, demographic, and genetic variables from a study population including patients who presented with non-traumatic aortic pathology. In some embodiments, the variables include at least one of demographic information, aortic length, aortic diameter, engineering characteristics, key non-size variables, general data items, or combinations thereof. In some embodiments, the key non-size variables include pain, length / tortuosity, genes, family history, bicuspid aortic valve, diabetes, aortic stress, biomarkers, KIF6, root vs. asc, PET imaging, or combinations thereof. In some embodiments, the aortic diameter is measured from the aortic root in the axial plane. In some embodiments, the diameter of the aortic root in the axial plane is measured using a Laplace technique. In some embodiments, the aortic diameter is measured from the aortic root in the coronal plane.

[0007] In some embodiments, forming the training dataset includes preprocessing the initial dataset. In some embodiments, the preprocessing includes removing outliers, processing temporal data, or a combination thereof.

[0008] In some embodiments, forming the training dataset includes feature engineering of the initial dataset. In some embodiments, the feature engineering includes pivoting measurements, creating a target variable, handling of missing data, or combinations thereof. In some embodiments, pivoting the measurements includes pivoting longitudinal measurements of aortic dimensions to create a dataset where each row represents a specific measurement event. In some embodiments, creating the target variable includes constructing a primary outcome variable to indicate the occurrence of any adverse event within a defined post-measurement timeframe. In some embodiments, the primary outcome variable is binary. In some embodiments, the handling of missing data includes imputing a large value placeholder for the missing data. In some embodiments, the large value placeholder is recognized as missing data during recursive partitioning.

[0009] In some embodiments, forming the training dataset includes selecting features based on their importance. In some embodiments, the feature importance is determined by an initial random-forest model.

[0010] In some embodiments, the method further comprises hyperparameter refinement through systematic exploration of a hyperparameter space to minimize a mean squared error (MSE) on a validation set.

[0011] In some embodiments, the machine learning model is a random forest machine learning model. In some embodiments, the machine learning model is thresholded at 0.1.

[0012] In another aspect, a method of assessing aortic risk comprises providing the machine learning model trained according to any of the embodiments disclosed herein; inputting a subject’s information to the machine learning model; and receiving an output from the machine learning model; wherein the output from the machine learning model indicates aortic risk of the subject. In some embodiments, the method further comprises treating the subject based upon the output from the model. In some embodiments, the treatment includes aortic repair surgery.

[0013] In another aspect, a computer readable storage medium comprises computer-executable instructions for performing the method according to any of the embodiments disclosed herein.

[0014] In another aspect, an apparatus for assessing aortic risk comprises a processor; a memory unit; and a communication interface; wherein the processor is connected to the memory unit and the communication interface; and wherein the processor and memory are configured to implement the method according to any of the embodiments disclosed herein.

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] For a fuller understanding of the nature and desired objects of the present invention, reference is made to the following detailed description taken in conjunction with the accompanying drawing figures wherein like reference characters denote corresponding parts throughout the several views.

[0016] FIG. 1 shows a schematic illustrating genes associated with aortic risk.

[0017] FIGS. 2A-B show an image and graph relating to measurement of aortic length. (A) An image illustrating the aortic length measurement. (B) A graph illustrating hinge points in the aortic length measurements.

[0018] FIGS. 3A-E show images of aortic diameter measurement. (A) Aortic diameter measurement in the axial plane. (B) Aortic diameter measurement in the coronal plane. (C) Application of the extended law of Laplace to aortic root measurement. (D) Image showing an overlay of the Laplace technique for measuring the aortic root. (E) Comparison of Laplace (axial) diameter measurement to coronal diameter measurement.

[0019] FIG. 4 shows a graph illustrating a confusion matrix for a model trained according to an embodiment of the disclosure.

[0020] FIG. 5 shows a graph illustrating the ROC curve for a model trained according to an embodiment of the disclosure.

[0021] FIG. 6 shows a graph illustrating a confusion matrix for the 5cm rule.

[0022] FIG. 7 shows a graph illustrating the ROC curve for the 5cm rule.

DETAILED DESCRIPTION

Definitions

[0023] As used herein, each of the following terms has the meaning associated with it in this section. Unless defined otherwise, all technical and scientific terms used herein generally have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Generally, the nomenclature used herein and the laboratory procedures in molecular biology, immunology, animal pharmacology, pharmaceutical science, peptide chemistry, and organic chemistry are those well-known and commonly employed in the art. It should be understood that the order of steps or order for performing certain actions is immaterial, so long as the present teachings remain operable. Any use of section headings is intended to aid reading of the document and is not to be interpreted as limiting; information that is relevant to a section heading may occur within or outside of that particular section. All publications, patents, and patent documents referred to in this document are incorporated by reference herein in their entirety, as though individually incorporated by reference.

[0024] In the application, where an element or component is said to be included in and / or selected from a list of recited elements or components, it should be understood that the element or component can be any one of the recited elements or components and can be selected from a group consisting of two or more of the recited elements or components.

[0025] In the methods described herein, the acts can be carried out in any order, except when a temporal or operational sequence is explicitly recited. Furthermore, specified acts can be carried out concurrently unless explicit claim language recites that they be carried out separately. For example, a claimed act of doing X and a claimed act of doing Y can be conducted simultaneously within a single operation, and the resulting process will fall within the literal scope of the claimed process.

[0026] As used herein, the singular form “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.

[0027] Unless specifically stated or obvious from context, as used herein, the term “about” is understood as within a range of normal tolerance in the art, for example within 2 standard deviations of the mean. “About” can be understood as within 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.1%, 0.05%, or 0.01% of the stated value. Unless otherwise clear from context, all numerical values provided herein are modified by the term about.

[0028] As used herein, the terms “comprises,” “comprising,” “containing,” “having,” and the like can have the meaning ascribed to them in U.S. patent law and can mean “includes,” “including,” and the like.

[0029] Unless specifically stated or obvious from context, the term “or,” as used herein, is understood to be inclusive.

[0030] Ranges provided herein are understood to be shorthand for all of the values within the range. For example, a range of 1 to 50 is understood to include any number, combination of numbers, or sub-range from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 (as well as fractions thereof unless the context clearly dictates otherwise).

[0031] As used herein, the term “ratio” refers to a relationship between two numbers (e.g., scores, summations, and the like). Although ratios can be expressed in a particular order (e.g., a to b or a:b), one of ordinary skill in the art will recognize that the underlying relationship between the numbers can be expressed in any order without losing the significance of the underlying relationship, although observation and correlation of trends based on the ratio may need to be reversed. For example, if the values of a over time are (4, 10) and the values of b over time are (2, 4), the ratio a:b will equal (2, 2.5), while the ratio b:a will be (0.5, 0.4). Although the values of a and b are the same in both ratios, the ratios a:b and b:a are inverse and increase and decrease, respectively, over the time period.

Detailed Description

[0032] Provided herein are methods of training a machine learning model for assessing aortic risk. In some embodiments, the method includes providing an initial dataset, forming a training dataset from the initial dataset, and training a machine learning model using the training dataset. The machine learning model includes any suitable model for developing a predictive model, such as, but not limited to, a random-forest model.

[0033] The initial dataset includes any suitable dataset encompassing variables related to patients within an aortic pathology study population. For example, in some embodiments, the initial dataset includes clinical, demographic, and / or genetic variables (FIG. 1) from a study population including patients who presented with non-traumatic aortic pathology (e.g., aneurysm and dissection). In some embodiments, the initial dataset includes variable information from patients over multiple years, multiple visits, multiple tests, and / or long-duration follow-ups. The variables included in the initial dataset include, but are not limited to, demographic information, aortic length (FIGS. 2A-B), aortic diameter, engineering characteristics (including orthogonal cross-sectional surface area at all axial slices of the aorta and aortic-wall thickness and measures derived and calculated therefrom), key non-size variables, general data items (e.g., 120 additional patient / aneurysm characteristics gleaned from patient medical records), or combinations thereof. In some embodiments, the key non-size variables include pain, length / tortuosity, genes, family history, bicuspid aortic valve, diabetes, aortic stress (e.g., exercise, blood pressure), biomarkers (e.g., “RNA Signature” test), KIF6, root vs. asc, and / or PET imaging, as shown in Table 1.

Table 1

[0034] In some embodiments, measuring the aortic diameter includes measuring the aortic root in the axial plane (FIG. 3A) or the coronal plane (FIG. 3B). In some embodiments, measuring the aortic root diameter in the axial plane includes using a Laplace technique (FIGS. 3C-D). As illustrated in FIGS. 3C-D, the Laplace technique includes measuring a Laplace radius, then determining a Laplace diameter as two times the Laplace radius. Referring to FIG. 3E, the Laplace radius (top panel) is measured as 24.3 mm, which gives a Laplace diameter of 24.3 x 2 = 48.6 mm, while the coronal diameter was measured as 45.0 mm, indicating a difference of 3.6 mm.
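The arithmetic in the FIG. 3E comparison can be reproduced with a short Python sketch. The function name is hypothetical, and the sketch covers only the doubling step; how the Laplace radius itself is measured on the image is not specified in this passage and is not modeled here.

```python
def laplace_diameter_mm(laplace_radius_mm: float) -> float:
    """Return the Laplace diameter, defined in the text as two times the Laplace radius."""
    return 2.0 * laplace_radius_mm


# Values quoted for FIG. 3E.
laplace_d = laplace_diameter_mm(24.3)   # axial (Laplace) diameter in mm
coronal_d = 45.0                        # coronal diameter in mm

# Round to one decimal place to hide floating-point noise in the subtraction.
difference = round(laplace_d - coronal_d, 1)
print(laplace_d, difference)
```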

[0035] The engineering characteristics can be obtained and / or calculated in any suitable manner. In some embodiments, for example, the engineering characteristics are previously measured and / or calculated and provided as a variable for the initial dataset. Additionally or alternatively, in some embodiments, the engineering characteristics are calculated directly from the pressure and geometry of the aorta. In some such embodiments, the pressure includes a blood pressure (e.g., systolic) measured from a patient and the geometry includes a three-dimensional (3D) reconstruction of the surface geometry of the aorta.

[0036] In some embodiments, where the engineering characteristics are calculated directly, the 3D reconstruction can be generated in any suitable manner, such as, but not limited to, from computed tomography (CT) images of a patient’s aorta. In some embodiments, a finite element (FE) model of the aorta is generated from the 3D reconstruction. For example, in some embodiments, 4-node quadrilateral shell elements are created first based on the surface geometry, with element size of about 2 mm by 2 mm, then four layers of 8-node linear brick elements (C3D8R) are created by offsetting the 4-node quadrilateral shell elements.

[0037] In some embodiments, forming the training dataset includes preprocessing the initial dataset. Preprocessing the initial dataset can include addressing common issues in clinical data, such as, but not limited to, missing values, outliers, and / or complex temporal relationships. For example, in some embodiments, the preprocessing includes identifying and removing outliers in key numeric variables. The outliers may be identified and removed based upon any suitable threshold, such as, but not limited to, a 5-standard-deviation threshold. In some embodiments, the preprocessing includes processing temporal data to ensure chronological consistency of events.
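As an illustration of the outlier step, the following is a minimal Python sketch of a 5-standard-deviation filter on a single numeric variable. The function name, the synthetic diameters, and the injected data-entry error are all hypothetical; the disclosure does not state whether the mean and standard deviation are computed per variable, per cohort, or otherwise, so a simple per-variable population standard deviation is assumed.

```python
import random
import statistics


def remove_outliers(values, n_sd=5.0):
    """Keep only values within n_sd standard deviations of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return list(values)  # all values identical, nothing to remove
    return [v for v in values if abs(v - mean) <= n_sd * sd]


# Synthetic example: 200 plausible diameters plus one obvious entry error.
rng = random.Random(0)
diameters_cm = [rng.gauss(4.5, 0.5) for _ in range(200)]
diameters_cm.append(450.0)  # hypothetical mistyped value

cleaned = remove_outliers(diameters_cm)
print(len(diameters_cm), len(cleaned))
```

Note that a 5-standard-deviation rule can only flag a value when the sample is large enough; in a sample of fewer than about 28 values, no single point can lie that far from the mean.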

[0038] Additionally or alternatively, in some embodiments, forming the training dataset includes feature engineering and / or dimensionality reduction of the initial dataset. The feature engineering includes one or more of pivoting measurements, creating a target variable, and / or handling of missing data. The dimensionality reduction includes selecting features based on their importance and / or low correlation with other variables and / or dimensionality-reduction techniques such as principal components analysis (PCA), linear discriminant analysis (LDA), non-negative matrix factorization (NMF), t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP), and / or autoencoders. In some embodiments, the importance of the features is determined by an initial random-forest model. In some embodiments, features with zero importance are dropped to reduce dimensionality, enhancing the model's efficiency and interpretability.
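The importance-based selection described above can be sketched with scikit-learn as follows. This is an illustration under assumptions, not the disclosed implementation: the feature names, the synthetic data, the toy outcome, and the choice of RandomForestClassifier with its impurity-based feature_importances_ are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: two informative-looking columns and one constant column.
rng = np.random.default_rng(0)
n = 300
feature_names = ["diameter_cm", "length_cm", "constant_flag"]
X = np.column_stack([
    rng.normal(4.8, 0.6, n),   # hypothetical aortic diameter
    rng.normal(10.5, 1.0, n),  # hypothetical aortic length
    np.zeros(n),               # carries no information at all
])
y = (X[:, 0] > 5.2).astype(int)  # toy outcome driven by diameter only

# Fit an initial forest and read its impurity-based importances.
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
importances = dict(zip(feature_names, forest.feature_importances_))

# Drop every feature whose importance is exactly zero.
kept = [name for name, imp in importances.items() if imp > 0.0]
X_reduced = X[:, [feature_names.index(name) for name in kept]]
print(kept)
```

A constant column can never be used for a split, so its importance is exactly zero and it is removed; whether weakly informative columns survive depends on the data and the forest settings.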

[0039] In some embodiments, pivoting the measurements includes pivoting longitudinal measurements of aortic dimensions to create a dataset where each row represents a specific measurement event. This restructuring facilitates the analysis of changes in aortic size over time.
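One way to picture the pivot is the pandas sketch below. It assumes, hypothetically, that the raw records are stored one value per row (patient, date, dimension name, value); the actual source layout and the column names used here are not given in the disclosure.

```python
import pandas as pd

# Hypothetical long-format records: one row per individual measured value.
long_records = pd.DataFrame({
    "patient_id": [1, 1, 1, 1, 2, 2],
    "measure_date": ["2020-01-10", "2020-01-10", "2021-02-15",
                     "2021-02-15", "2020-03-01", "2020-03-01"],
    "dimension": ["diameter_cm", "length_cm"] * 3,
    "value": [4.6, 10.2, 4.9, 10.8, 5.1, 11.3],
})

# Pivot so each row is one measurement event (patient + date),
# with one column per aortic dimension.
events = long_records.pivot_table(
    index=["patient_id", "measure_date"],
    columns="dimension",
    values="value",
).reset_index()
events.columns.name = None  # drop the leftover "dimension" axis label
print(events)
```

With one row per event, successive rows for the same patient can be compared directly to follow changes in aortic size over time.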

[0040] In some embodiments, the creation of a target variable includes constructing a primary outcome variable to indicate the occurrence of any adverse event within a specific time post-measurement. Adverse events include, but are not limited to, death, dissection, and / or rupture. Post-measurement timeframes include, but are not limited to, within 12 months, within 18 months, within 24 months, within 36 months, or any suitable combination, sub-combination, range, or sub-range thereof. In some embodiments, the primary outcome is binary, indicating whether or not the patient experienced an adverse event.
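A minimal Python sketch of such a binary outcome is shown below. The function name is hypothetical, a 12-month window is approximated as 365 days, and an event on the measurement date itself is counted as inside the window; none of these boundary choices is specified in the disclosure.

```python
from datetime import date, timedelta


def adverse_event_within(measure_date, event_dates, window_days=365):
    """Return 1 if any adverse event (death, dissection, rupture) falls
    within window_days after the measurement date, otherwise 0."""
    window_end = measure_date + timedelta(days=window_days)
    return int(any(measure_date <= d <= window_end for d in event_dates))


measured = date(2020, 1, 1)
print(adverse_event_within(measured, [date(2020, 6, 1)]))  # event inside the window
print(adverse_event_within(measured, [date(2021, 6, 1)]))  # event after the window
print(adverse_event_within(measured, []))                  # no recorded event
```

Changing window_days gives the other timeframes mentioned in the text (for example, roughly 730 days for 24 months).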

[0041] In some embodiments, the handling of missing data includes imputing with a placeholder. The handling of the missing data ensures that the machine learning algorithm can process the dataset without encountering errors. In some embodiments, a large value (e.g., 100000000) is used as the placeholder, such as, but not limited to, when using recursive-partitioning methods, allowing missing data to be categorized and recognized distinctly, instead of making the assumptions of other types of imputing methods.
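The placeholder idea can be sketched in plain Python as follows. The record layout and field names are hypothetical; only the placeholder value 100000000 comes from the text.

```python
MISSING_PLACEHOLDER = 100_000_000  # large sentinel value quoted in the text


def impute_placeholder(rows, placeholder=MISSING_PLACEHOLDER):
    """Replace missing entries (None) with the sentinel, leaving real values,
    including genuine zeros, untouched."""
    return [
        {key: (placeholder if value is None else value) for key, value in row.items()}
        for row in rows
    ]


# Hypothetical records: one missing length, one genuine zero.
records = [
    {"diameter_cm": 4.8, "length_cm": None, "diabetes": 0},
    {"diameter_cm": 5.3, "length_cm": 11.1, "diabetes": 1},
]
imputed = impute_placeholder(records)
print(imputed[0])
```

Because the sentinel lies far outside any plausible measurement, a tree-based model can isolate it with a single split, so "missing" stays distinguishable from zero or from any imputed typical value.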

[0042] In some embodiments, forming the training dataset further includes splitting the preprocessed, feature engineered, and / or dimensionality reduced initial dataset into separate training and testing datasets. For example, in some embodiments, the initial dataset is split into training and testing sets using stratified sampling via the “train_test_split” function from scikit-learn. This approach ensures that the distribution of the target variable is consistent across both sets, reducing the risk of sampling bias. The initial dataset may be split into the training and testing sets in any suitable proportion, such as, but not limited to, 60:40, 70:30, 80:20, or any combination, sub-combination, range, or sub-range thereof.
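A stratified 80:20 split with scikit-learn's train_test_split might look like the sketch below. The synthetic features, the 10% event rate, and the random seed are hypothetical values chosen only to make the stratification visible.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 100 measurement events, 10 of which carry an adverse event.
X = np.arange(100).reshape(-1, 1)
y = np.array([1] * 10 + [0] * 90)

# stratify=y keeps the event rate the same in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0
)

print(len(X_train), int(y_train.sum()))  # training size and event count
print(len(X_test), int(y_test.sum()))    # testing size and event count
```

Without stratification, a rare outcome such as an adverse event could by chance be over- or under-represented in the test set, which would distort the evaluation.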

[0043] In some embodiments, training the machine learning model includes hyperparameter refinement. For example, the hyperparameter refinement may include systematically exploring the hyperparameter space to minimize the mean squared error (MSE) on the validation set. In some embodiments, the hyperparameter refinement is conducted using the Optuna library. In some embodiments, the hyperparameter refinement includes tuning parameters such as the number of trees (n_estimators), tree depth (max_depth), and / or the minimum number of samples required to split a node (min_samples_split). In some embodiments, the best-performing model is retrained on the entire training set with the refined hyperparameters. In some embodiments, a range of models is developed, their hyperparameters are tuned, and the various models are assembled into an ensemble using the AutoGluon library.
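The search for hyperparameters that minimize validation MSE can be sketched as below. To keep the example dependency-light, a plain exhaustive grid loop is swapped in for the Optuna library named in the text; the grid values, the synthetic data, and the use of RandomForestRegressor on a 0/1 outcome are likewise assumptions made for illustration.

```python
from itertools import product

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic training data with a binary outcome encoded as 0.0 / 1.0.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=300) > 1.0).astype(float)

# Carve a validation subset out of the training data.
X_fit, X_val, y_fit, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical search grid over the three parameters named in the text.
grid = {"n_estimators": [20, 50], "max_depth": [2, 4], "min_samples_split": [2, 10]}

best_params, best_mse = None, float("inf")
for values in product(*grid.values()):
    params = dict(zip(grid.keys(), values))
    model = RandomForestRegressor(random_state=0, **params).fit(X_fit, y_fit)
    mse = mean_squared_error(y_val, model.predict(X_val))
    if mse < best_mse:
        best_params, best_mse = params, mse

# Retrain on the entire training set with the refined hyperparameters.
final_model = RandomForestRegressor(random_state=0, **best_params).fit(X, y)
print(best_params, round(best_mse, 4))
```

A library such as Optuna replaces the exhaustive loop with a guided search over the same objective, which matters once the hyperparameter space is too large to enumerate.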

[0044] The resulting model can be thresholded at any suitable level for providing the desired classification. For example, in some embodiments, predictions from the random-forest model are thresholded at 0.1 to classify patients as high-risk or low-risk for adverse events. In such embodiments, the low threshold skews towards recommending surgery, since false negatives are clinically more consequential than false positives.
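The thresholding step can be sketched in a few lines of Python. The function name and the example prediction values are hypothetical; the 0.1 threshold and the "greater than" comparison follow the text.

```python
def classify_risk(prediction: float, threshold: float = 0.1) -> str:
    """Label a model output as high-risk when it exceeds the threshold."""
    return "high-risk" if prediction > threshold else "low-risk"


# Hypothetical model outputs for three patients.
for score in (0.03, 0.10, 0.42):
    print(score, classify_risk(score))
```

Raising the threshold would flag fewer patients, trading missed events (false negatives) for fewer unnecessary referrals (false positives); the low value of 0.1 reflects the stated preference for avoiding false negatives.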

[0045] Also provided herein are methods of assessing aortic risk using the machine learning model according to the embodiments disclosed herein. In some embodiments, the method of assessing aortic risk includes providing the machine learning model trained according to the present disclosure, inputting a subject’s information to the model, receiving an output from the model based upon the subject’s information, and determining an aortic risk based upon the model output. In some embodiments, the methods disclosed herein provide an automated, computerized tool for prediction of personalized risk in aortic aneurysm disease (e.g., natural risk of rupture or dissection of ascending thoracic aortic aneurysm (ATAA)). In some embodiments, the method further includes treating the subject based upon the output from the model. For example, in some embodiments, the method includes treating the subject when the model output classifies them as high-risk for adverse events (e.g., when the model output is above the threshold level). In some embodiments, the treatment includes surgery, such as, but not limited to, aortic repair surgery. For example, in some embodiments, the method includes treating the subject with aortic repair surgery when the model classifies the patient as high-risk with an output of greater than 0.1 (or any other suitable predetermined threshold level).

[0046] Without wishing to be bound by theory, it is believed that the machine learning model trained according to one or more of the embodiments disclosed herein provides significantly higher overall accuracy, specificity, and / or positive predictive value (PPV) with respect to aortic risk as compared to the 5cm rule. Additionally, the machine learning model trained according to one or more of the embodiments disclosed herein provides a higher Matthews correlation coefficient as compared to the 5cm rule, indicating better overall predictive performance. Accordingly, the machine learning model trained according to one or more of the embodiments disclosed herein effectively balances sensitivity and specificity, reducing the number of false positives while maintaining a high sensitivity.
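The metrics named above all derive from the four cells of a confusion matrix, as the Python sketch below shows using the standard definitions. The counts passed in are made up purely to exercise the formulas and are not results from the disclosure.

```python
from math import sqrt


def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, sensitivity, specificity, PPV, and the Matthews
    correlation coefficient (MCC) from confusion-matrix counts."""
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }


# Hypothetical counts, for illustration only.
metrics = confusion_metrics(tp=8, fp=2, fn=2, tn=88)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MCC is useful here because adverse events are rare: unlike accuracy, it stays low for a rule that labels nearly everyone the same way, so it gives a fairer comparison between a model and a fixed size cutoff.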

[0047] Further provided herein are a computer readable storage medium and an apparatus for assessing aortic risk. In some embodiments, the computer readable storage medium includes any suitable computer readable storage medium storing computer-executable instructions for performing the method according to any of the embodiments disclosed herein. In some embodiments, the apparatus for assessing aortic risk includes a processor, a memory unit, and a communication interface. In one embodiment, the processor is connected to the memory unit and the communication interface. In another embodiment, the processor and memory are configured to implement the method according to any one of the embodiments disclosed herein. In some embodiments, the trained machine learning model is accessible through a website for data entry and risk prediction.

[0048] Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, numerous equivalents to the specific procedures, embodiments, claims, and examples described herein. Such equivalents are considered to be within the scope of this invention and covered by the claims appended hereto.

[0049] It is to be understood that wherever values and ranges are provided herein, all values and ranges encompassed by these values and ranges, are meant to be encompassed within the scope of the present invention. Moreover, all values that fall within these ranges, as well as the upper or lower limits of a range of values, are also contemplated by the present application.

[0050] The following examples further illustrate aspects of the present invention. However, they are in no way a limitation of the teachings or disclosure of the present invention as set forth herein.

EXAMPLES

EXAMPLE 1 - Artificial Intelligence Based Tool for Prediction of Natural Risk Posed by Ascending Thoracic Aortic Aneurysms

Introduction

[0051] Current practice relies upon evidence-based recommendations for the timing of surgical aortic resection in connection with thoracic aortic aneurysm. The aim has been to recommend surgery before an aortic event occurs, but to avoid surgery until the risk of aneurysm rupture or dissection justifies the (currently small) operative risk. Prior analysis showed that the existing size algorithm for ascending aortic intervention worked quite effectively in “real world” clinical application.

[0052] Initial recommendations, based on a “hinge point” in risk noted at 6.0 cm aortic diameter, suggested a general criterion of 5.5 cm for ascending aortic resection in reasonable risk patients. As more data accrued, two new hinge points were noted at 5.25 and 5.75 cm. Accordingly, a lower surgical criterion of 5.0 cm was recommended (for reasonable risk patients). Both the United States and the European Cardiologic and Surgical Societies endorsed this “left-shift” of the standard surgical criterion (for good risk surgical patients). Concurrently, as experience with ascending aortic resection accumulated nationwide and worldwide, safety of aortic surgery improved, making such early resection even more reasonable.

[0053] With time, the present inventors realized that aortic tissue grows concurrently in all directions, so the aorta grows longitudinally as well as in width (diameter). As the ascending aorta grows longitudinally over time, it takes a curve toward the right chest, because its bottom and top are fixed in location (at the heart and the aortic arch, respectively). With the realization of the occurrence and importance of aortic lengthening, two hinge points have been identified in aortic length, which predicted adverse events. On the basis of those observations, surgical resection was recommended at an ascending aortic length of 11 cm (measured along a centerline from the aortic annulus to the base of the innominate artery). In view thereof, the present inventors concluded that aortic length had previously been “neglected” as a criterion for ascending aortic intervention.

[0054] Initially, diameter and aortic length measurements were indexed to the patient’s body surface area (BSA). Subsequently, the present inventors found that height alone sufficed for indexing of aortic diameter or aortic length. Simply put, the aorta did not seem to care if excess body weight had been gained. Height alone sufficed very well for indexing aortic size, thus eliminating the need for BSA calculation.

[0055] Concurrently with these morphologic analyses and consequent surgical recommendations, the present inventors also identified a set of eleven non-size (or non-dimensional) variables that affect the risk of rupture of the ascending aorta. These, along with their direction of impact, are listed in Table 1. It was recommended that these important variables be incorporated into decision-making strategy, with all decisions tempered by “surgical judgment” for a specific patient.

[0056] In this Example, AI techniques are applied to the large, multifaceted database in order to develop an automated, computerized tool for prediction of personalized risk in aortic aneurysm disease. This tool incorporates (1) aortic length and aortic diameter natural history data, (2) the eleven non-size criteria previously identified, and (3) 120 additional patient / aneurysm characteristics gleaned from our aortic database and the patient medical records (EPIC format).

[0057] Herein, we present this AI tool for aortic risk assessment. This tool encompasses not only diameter and size but also all the non-size criteria, as well as numerous other general patient variables gleaned from the medical record. The accuracy of this AI (armed with the data indicated) in predicting outcomes among aortic database patients is analyzed herein. Without wishing to be bound by theory, it is believed that the AI techniques presented herein improve aortic risk assessment above and beyond prior efforts, so as to provide more accurate modern decision making for individual patients and to maximize their chances for long-term survival in the face of ascending aortic aneurysm disease.

Methods

Study design and data source

[0058] This retrospective cohort study aimed to develop and validate a machine learning model to predict aortic dissection, rupture, or any-cause death (“adverse event”) within one year of a patient’s interaction with our Aortic Institute Team. The study utilized a de-identified dataset obtained from the Aortic Institute Database, encompassing a wide range of clinical, demographic, and genetic variables. Patients typically remain patients of the Aortic Institute for multiple years, allowing for multiple visits, multiple tests, and long-duration follow-up.

[0059] The study population included patients who presented with non-traumatic aortic pathology, largely aneurysm and dissection. Data for this specific study were anonymized to ensure confidentiality.

Data Preprocessing

[0060] The data were preprocessed to ensure that the dataset was suitable for machine learning and that the model could generalize well to new data. The preprocessing steps were designed to address common issues in clinical data, such as missing values, outliers, and complex temporal relationships.

Outlier removal

[0061] Outliers in key numeric variables were identified and removed using a 5-standard-deviation threshold. This method is based on standard statistical practices to reduce the influence of extreme values that could bias the model (Hodge VJ, Austin J. A Survey of Outlier Detection Methodologies. Artif Intell Rev. 2004;22:85-126).

Data handling

[0062] Temporal data were processed to ensure the chronological consistency of events. Custom functions were employed to remove records where surgery dates were less than one year after key reference dates, which provided accurate modeling of time-to-event outcomes.

Feature engineering

Pivoting measurements

[0063] Longitudinal measurements of aortic dimensions were pivoted to create a dataset where each row represented a specific measurement event. This restructuring facilitated the analysis of changes in aortic size over time.

Creation of target variable

[0064] The primary outcome variable was constructed to indicate the occurrence of any adverse event (death, dissection, rupture) within 12 months post-measurement. This variable was binary, indicating whether or not the patient experienced an adverse event.

Handling of missing data

[0065] Missing data were imputed using a placeholder (100000000), ensuring that the machine learning algorithms could process the dataset without encountering errors. This large value is appropriate when using recursive-partitioning methods (such as were used in this modelling process) and allows missing data to be categorized and recognized distinctly, instead of making the assumptions of other types of imputing methods. That is, rather than ignoring the number or making an assumption, the model recognizes the value as being blank and treats the blank entry differently from a zero entry or some other entry of an imputed value.

Dimensionality reduction

[0066] Features were selected based on their importance as determined by an initial random-forest model. Features with zero importance were dropped to reduce dimensionality, enhancing the model's efficiency and interpretability.

Model development

[0067] The development of the predictive model involved the following steps.

Data splitting

[0068] The dataset was split into training (70%) and testing (30%) sets using stratified sampling via the “train_test_split” function from scikit-learn. This approach ensured that the distribution of the target variable was consistent across both sets, reducing the risk of sampling bias.

Model training

[0069] A random-forest regressor was chosen for its robustness and ability to handle large feature sets. The model was trained on the training data, with feature importance calculated to identify the most predictive variables.

Hyperparameter tuning using Optuna

[0070] Hyperparameter refinement was conducted using the Optuna library. Optuna systematically explored the hyperparameter space to minimize the mean squared error (MSE) on the validation set. This process included tuning parameters such as the number of trees (n_estimators), tree depth (max_depth), and the minimum number of samples required to split a node (min_samples_split).

[0071] The best-performing model, as identified by Optuna, was retrained on the entire training set with the refined hyperparameters.

Model Evaluation

[0072] Model evaluation was conducted using a comprehensive set of metrics, ensuring a thorough assessment of the model's predictive performance.

Binary classification

[0073] Predictions from the random-forest model were thresholded at 0.1 to classify patients as high-risk or low-risk for adverse events. This threshold was chosen based on the distribution of predictions and the clinical relevance of the risk stratification.

Performance metrics

[0074] The model's performance was assessed using accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). These metrics provide a detailed view of the model's strengths and weaknesses in different aspects of binary classification.

ROC curve and AUC

[0075] The Receiver Operating Characteristic (ROC) curve was plotted to visualize the trade-off between sensitivity and specificity across different thresholds. The Area Under the Curve (AUC) was calculated, providing a single scalar value to summarize the model's overall performance.

Comparison with clinical decision rule

[0076] The model's predictions were compared with the traditional 5 cm rule for aortic diameter, a widely used clinical threshold for surgical intervention. The comparison involved calculating the Matthews correlation coefficient (MCC), accuracy, and Gini coefficient, offering insights into how the machine learning model performs relative to established clinical practice. This analysis and comparison are discussed in the Results section below.

Ethical Considerations

[0077] The model’s predictions were critically evaluated for their potential impact on clinical decision-making, ensuring that they would support rather than replace human judgment.

Results

[0078] A total of 5,006 patient records were analyzed to evaluate the performance of a machine-learning model in predicting adverse aortic events, defined as dissection, rupture, or all-cause death within one year following a patient interaction. These records may include multiple visits from the same patient, as individuals often undergo several assessments over time. The machine-learning model's performance was assessed on a test subset of 1,502 records (30% of the total dataset), which also may include multiple visits from the same patients. To ensure a fair comparison, the current clinical guideline recommending aortic repair surgery for patients with an aortic diameter exceeding 5 cm (the “5 cm rule”) was evaluated using the same test set.

Machine-learning model performance

[0079] The machine-learning model was trained and validated on the dataset, with its performance assessed on the test subset using a threshold probability of 0.1 for classifying patients as high risk. This threshold was deliberately chosen because type-II errors (false negatives) are clinically more consequential than type-I errors (false positives) in this context. Not performing surgery when a patient is at risk of an adverse event is often fatal (defined to be a “catastrophic error”), whereas performing surgery on a patient who might not have an adverse event within the next year carries substantially lower risk and may be necessary in the near future regardless. Therefore, the model is intentionally skewed towards recommending surgery by selecting a lower threshold.
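The effect of choosing a low threshold can be illustrated with a short sketch. The scores and outcomes below are hypothetical, and treating a score exactly equal to the threshold as high risk is an assumption; the text specifies only the cut-off of 0.1.

```python
THRESHOLD = 0.1  # probability cut-off stated in the text


def classify(scores, threshold=THRESHOLD):
    """Label each predicted risk score as high risk (1) or low risk (0)."""
    return [1 if s >= threshold else 0 for s in scores]


def confusion_counts(predicted, actual):
    """Return (TP, FP, FN, TN) for binary predictions against outcomes."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == 1 and a == 1 for p, a in pairs)
    fp = sum(p == 1 and a == 0 for p, a in pairs)
    fn = sum(p == 0 and a == 1 for p, a in pairs)
    tn = sum(p == 0 and a == 0 for p, a in pairs)
    return tp, fp, fn, tn


# Hypothetical predicted risks and observed one-year outcomes (1 = adverse event).
scores = [0.02, 0.08, 0.12, 0.35, 0.09, 0.60, 0.11, 0.01]
actual = [0, 0, 0, 1, 1, 1, 0, 0]

for cutoff in (0.1, 0.5):
    tp, fp, fn, tn = confusion_counts(classify(scores, cutoff), actual)
    print(f"threshold {cutoff}: TP={tp} FP={fp} FN={fn} TN={tn}")
```

On this toy data, lowering the cut-off from 0.5 to 0.1 reduces false negatives from 2 to 1 while raising false positives from 0 to 2, which is the trade-off the paragraph above accepts deliberately.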

[0080] The confusion matrix for the machine-learning model is presented in FIG. 4. From these outcomes, the following performance metrics were calculated:

[0081] The Matthews correlation coefficient of 0.379 indicates a moderate positive relationship between the predicted and actual classifications, reflecting the model's reliability. The Gini coefficient of 0.754 further underscores the model's good discriminative ability.
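For reference, both coefficients can be computed from standard formulas: the Matthews correlation coefficient from the four confusion-matrix counts, and the Gini coefficient as 2 × AUC − 1. The sketch below is illustrative only. The counts are not taken from FIG. 4, which is not reproduced in this text; they are hypothetical values back-calculated so that a 1,502-record test set yields the percentages reported for the machine-learning model.

```python
import math


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def gini_from_auc(auc):
    """Gini coefficient of a classifier, using the relation Gini = 2 * AUC - 1."""
    return 2 * auc - 1


# Assumed counts (back-calculated, not from FIG. 4): 109 events, 1,393 non-events.
tp, fp, fn, tn = 86, 251, 23, 1142

print(round(matthews_corrcoef(tp, fp, fn, tn), 3))  # 0.379
print(round(gini_from_auc(0.877), 3))               # 0.754
```

Under these assumed counts, the formulas reproduce the reported MCC of 0.379, and the reported AUC of 0.877 gives the reported Gini coefficient of 0.754.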

[0082] The receiver operating characteristic (ROC) curve for the machine-learning model is shown in FIG. 5, with an area under the curve (AUC) of 0.877, indicating excellent discriminative ability.

Performance of the 5-cm aortic diameter rule

[0083] The 5 cm rule was evaluated on the same test subset of 1,502 records. The confusion matrix is presented in FIG. 6. The equivalent calculated performance metrics are:

[0084] The Matthews correlation coefficient of 0.084 suggests a weak positive correlation between the predicted and actual classifications, indicating limited predictive power. The Gini coefficient of 0.154 reflects poor discriminative ability.

[0085] The ROC curve for the 5-cm rule is shown in FIG. 7, with an AUC of 0.58.

Comparative analysis

[0086] Comparing the two methods, the machine-learning model demonstrated markedly higher overall accuracy (81.76% vs. 38.75%) and specificity (81.98% vs. 35.54%) than the 5 cm rule. Both methods had similar sensitivity (ML model: 78.90%, 5 cm rule: 79.82%), indicating they were comparably effective in identifying patients at risk of adverse events. However, the machine-learning model had a substantially higher PPV (25.52% vs. 8.83%), meaning that when the machine-learning model predicted an adverse event, it was more likely to be correct. In addition, the machine-learning model had a higher Matthews correlation coefficient (0.379 vs. 0.084), signifying a stronger correlation between predicted and actual outcomes and enhancing confidence in its predictions. The Gini coefficient of 0.754 for the ML model, compared to 0.154 for the 5 cm rule, further confirms its superior discriminative power.

[0087] The NPV was slightly higher for the machine-learning model (98.03% vs. 95.74%), suggesting it was more reliable in predicting patients who would not experience adverse events.
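The five rates compared above all follow from the confusion-matrix counts of each method. The sketch below shows the arithmetic. The counts are hypothetical: they are not taken from FIG. 4 or FIG. 6, which are not reproduced in this text, but were back-calculated so that a 1,502-record test set reproduces the reported percentages to within about 0.01 percentage points.

```python
def rates(tp, fp, fn, tn):
    """Standard binary-classification rates from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # share of events that were flagged
        "specificity": tn / (tn + fp),   # share of non-events left unflagged
        "ppv": tp / (tp + fp),           # share of flags that were events
        "npv": tn / (tn + fn),           # share of non-flags that were non-events
    }


# Assumed counts (back-calculated, not from the figures).
ml_model = rates(tp=86, fp=251, fn=23, tn=1142)
five_cm_rule = rates(tp=87, fp=898, fn=22, tn=495)

for name in ml_model:
    print(f"{name:12s} ML model {100 * ml_model[name]:6.2f}%   "
          f"5 cm rule {100 * five_cm_rule[name]:6.2f}%")
```

With these counts the two methods detect almost the same number of events, while the 5 cm rule flags several times as many patients who have no event, which is what drives the gap in specificity, PPV, and accuracy.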

[0088] The 5 cm rule's low specificity and PPV indicate a high number of false positives, which could lead to unnecessary surgeries and increased healthcare costs. In contrast, the machine-learning model effectively balanced sensitivity and specificity, reducing the number of false positives while maintaining a high sensitivity. This balance is crucial in clinical practice to avoid overtreatment and associated risks.

[0089] By selecting a threshold probability of 0.1, the ML model prioritized the detection of at-risk patients, acknowledging that false negatives are more detrimental than false positives in this clinical scenario. The higher specificity and PPV of the ML model imply that it is more reliable in predicting patients who will experience adverse events, potentially improving patient outcomes through timely interventions.

Summary

[0090] The machine-learning model demonstrated superior performance over the current clinical guideline in predicting adverse aortic events within one year. With significantly higher accuracy, specificity, positive predictive value, Matthews correlation coefficient, and Gini coefficient, the ML model may serve as a more effective tool for guiding clinical decisions regarding aortic repair surgery. Its implementation could potentially lead to earlier identification of at-risk patients while reducing unnecessary surgeries resulting from false positives.

[0091] These findings align with other studies that have highlighted the limitations of using a fixed aortic diameter threshold for surgical intervention. For example, Davies et al. demonstrated that the rate of aortic expansion and other patient-specific factors play crucial roles in predicting aortic dissection and rupture. Additionally, the incorporation of machine-learning models in non-aortic cardiovascular risk prediction has shown promise in improving patient outcomes by integrating multiple variables and risk factors.

Discussion

[0092] The work reported herein takes our prediction models for adverse ascending aortic events into a new modality by applying Artificial Intelligence to the information in our extensive Thoracic Aortic Database. Patient data (including demographic information, aortic diameter, aortic length, the 11 key non-size variables, and multiple general data items) are entered into the computational model. Note that the Calculator has patient weight and height among its available data, so the program can “correct” for body size automatically as needed. The Calculator is armed with decades of intense, verified size and clinical data accrued at our Yale Aortic Institute. The Calculator outputs the projected risk of adverse events (aortic dissection, rupture, or death) for the coming year.

[0093] Our analysis on a large group of patients (1,502) with complete data indicates very effective prediction of adverse events, with an area under the ROC curve of 0.877. The False Negative rate is very low, only 1.6%, indicating that the Calculator misses very few patients who will suffer an adverse event. This is extremely important in saving lives. Comparison with prediction based only on aortic size (specifically, aortic diameter greater than 5.0 cm) reveals a predictive capability for the Calculator far exceeding the performance of a purely size-based prediction on the identical data set, which produced an area under the ROC curve of only 0.58.

[0094] The Aortic Risk Calculator includes all forms of death (from any cause) in its outcome, so the area under the ROC curve of 0.877 is especially impressive, as the Calculator could not be expected to predict non-aortic deaths, which still contribute to the statistical calculations of effectiveness of the Calculator. Our prior experience has shown that reliable determination of cause of death is difficult, even with diligent accrual of Death Certificates. It is hard to separate aortic deaths from non-aortic deaths, often despite the most diligent “after the fact” investigations.

[0095] This analysis suggests that the Aortic Risk Calculator developed herein may well enhance predictive ability and refine patient selection for surgery above and beyond accuracy and precision previously available.

EQUIVALENTS

[0096] Although preferred embodiments of the invention have been described using specific terms, such description is for illustrative purposes only, and it is to be understood that changes and variations may be made without departing from the spirit or scope of the following claims.

INCORPORATION BY REFERENCE

[0097] The entire contents of all patents, published patent applications, and other references cited herein are hereby expressly incorporated herein in their entireties by reference.

Claims

What is claimed is:

1. A method of training a machine learning model for assessing aortic risk, the method comprising: providing an initial dataset; forming a training dataset from the initial dataset; and training a machine learning model using the training dataset.

2. The method of claim 1, wherein the initial dataset includes at least one of clinical, demographic, and genetic variables from a study population including patients who presented with non-traumatic aortic pathology.

3. The method of claim 2, wherein the variables include at least one of demographic information, aortic length, aortic diameter, engineering characteristics, key non-size variables, general data items, or combinations thereof.

4. The method of claim 3, wherein the key non-size variables include pain, length / tortuosity, genes, family history, bicuspid aortic valve, diabetes, aortic stress, biomarkers, KIF6, root vs. asc, PET imaging, or combinations thereof.

5. The method of claim 3, wherein the aortic diameter is measured from the aortic root in the axial plane.

6. The method of claim 5, wherein the diameter of the aortic root in the axial plane is measured using Laplace technique.

7. The method of claim 3, wherein the aortic diameter is measured from the aortic root in the coronal plane.

8. The method of claim 1, wherein forming the training dataset includes preprocessing the initial dataset.

9. The method of claim 8, wherein the preprocessing includes removing outliers, processing temporal data, or a combination thereof.

10. The method of claim 1, wherein forming the training dataset includes feature engineering of the initial dataset.

11. The method of claim 10, wherein the feature engineering includes pivoting measurements, creating a target variable, handling of missing data, or combinations thereof.

12. The method of claim 11, wherein the pivoting measurements includes pivoting longitudinal measurements of aortic dimensions to create a dataset where each row represents a specific measurement event.

13. The method of claim 11, wherein creating the target variable includes constructing a primary outcome variable to indicate the occurrence of any adverse event within a defined post-measurement timeframe.

14. The method of claim 13, wherein the primary outcome variable is binary.

15. The method of claim 11, wherein the handling of missing data includes imputing a large value placeholder for the missing data.

16. The method of claim 15, wherein the large value placeholder is recognized as missing data during recursive-partitioning.

17. The method of claim 1, wherein forming the training dataset includes selecting features based on their importance.

18. The method of claim 17, wherein the feature importance is determined by an initial random-forest model.

19. The method of claim 1, further comprising hyperparameter refinement through systematic exploration of a hyperparameter space to minimize a mean squared error (MSE) on a validation set.

20. The method of claim 1, wherein the machine learning model is a random forest machine learning model.

21. The method of claim 1, wherein the machine learning model is thresholded at 0.1.

22. A method of assessing aortic risk, the method comprising: providing the machine learning model trained according to any one of claims 1-21; inputting a subject’s information to the machine learning model; and receiving an output from the machine learning model; wherein the output from the machine learning model indicates aortic risk of the subject.

23. The method of claim 22, further comprising treating the subject based upon the output from the model.

24. The method of claim 23, wherein the treatment includes aortic repair surgery.

25. A computer readable storage medium comprising computer-executable instructions for performing the method according to any one of claims 1-21.

26. An apparatus for assessing aortic risk comprising: a processor; a memory unit; and a communication interface; wherein the processor is connected to the memory unit and the communication interface; and wherein the processor and memory are configured to implement the method according to any one of claims 1-21.