Prognosis prediction for ST-segment-elevation myocardial infarction (STEMI) patients
Machine learning models trained on diverse clinical data improve the prediction of STEMI patient complications, enhancing accuracy and guiding clinical decisions.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- F HOFFMANN LA ROCHE & CO AG
- Filing Date
- 2025-12-18
- Publication Date
- 2026-06-25
AI Technical Summary
Current methods for predicting post-treatment complications in ST-segment-elevation myocardial infarction (STEMI) patients, such as in-hospital mortality and readmission risk, lack accuracy and are fragmented, with existing machine learning models having reproducibility and accessibility issues.
The disclosure describes machine learning models that use a set of features from demographics, lab results, and clinical data to predict in-hospital mortality, 6-month mortality, and 30-day readmission risk. The models are trained on large patient populations across multiple clinical sites to improve accuracy and generalizability.
The models provide enhanced accuracy and discriminatory performance for patient risk categorization, guiding clinicians in making informed decisions and improving clinical workflow efficiency.
Description
[0001] Prognosis prediction for ST-segment-elevation myocardial infarction (STEMI) patients
[0002] Technical field of disclosure
[0003] The present disclosure relates to methods for providing prognostic predictions for patients diagnosed with ST-segment-elevation myocardial infarction (STEMI), specifically in relation to the risk of post-treatment complications. The disclosure describes in particular the use of machine learning models to predict the risk of in-hospital mortality, mortality within a predetermined period after STEMI diagnosis (e.g. 6-month mortality), and / or readmission (e.g. 30-day readmission) at hospital after discharge of the STEMI patient following treatment.
[0004] Background
[0005] ST-segment-elevation myocardial infarction (STEMI) is a severe form of heart attack that occurs when a coronary artery is completely blocked, leading to significant damage to the heart muscle. This blockage typically results from the rupture of an atherosclerotic plaque and the subsequent formation of a blood clot. STEMI is identified by distinct alterations on an electrocardiogram (ECG), notably the elevation of ST segments, which indicate acute myocardial injury. Immediate medical intervention, specifically percutaneous coronary intervention (PCI), is crucial to restore blood flow, minimize heart damage, and improve the survival rates.
[0006] Despite advances in PCI and the immediate benefits of revascularization, further clinician interventions may still be necessary. Understanding whether a patient has a low or high risk of post-treatment complications is essential for improving the patient's clinical outcome and enabling hospitals to allocate their resources more efficiently. This knowledge would allow healthcare providers to weigh the potential risks against the potential benefits of more aggressive monitoring and / or treatment, and consequently to select a treatment plan suitable for the patient's risk level. Risk stratification of STEMI patients post-PCI remains a significant clinical challenge. The challenges arise from variability in patient outcomes, the complexity of the risk factors, and the need for early and accurate prediction.
[0007] The current standard of care clinical scores for patient risk assessment in this context are the Global Registry of Acute Coronary Events (GRACE) Score (2.0) (Fox et al. 2014) and the Length of Stay, Acuity of admission, Comorbidities, and Emergency department visits (LACE) index (Walraven et al. 2010). The GRACE Score (2.0) is used to predict the risk of short-term and long-term mortality in patients who present with acute coronary syndromes (ACS). The LACE index is used to predict the risk of death or unplanned readmission within 30 days after a patient's discharge from the hospital.
[0008] Machine learning (ML) models have been proposed for assessing the risk of post-treatment mortality or hospital readmission for patients suffering from myocardial infarction. For example, Deng et al. 2022 explored the use of random forest (RF), support vector machines (SVMs) and neural network algorithms (NNAs), with 37 input features, for predicting the no-reflow (NR) process and in-hospital death for post-PCI STEMI patients. Gupta et al. 2020 described the use of logistic regression (LR), SVMs, RF, gradient boosting (GB) and deep neural networks (DNN), with 192 input features, to predict readmission within 30 days and 1 year after discharge for patients hospitalized with acute myocardial infarction (AMI).
[0009] The standard of care methods still have prediction accuracies and discriminatory power that can be further improved. Further, no machine learning based models for assessing patients’ risk for an adverse event have been adopted for use in practice so far, and all still have reproducibility, accuracy and accessibility problems. Furthermore, the current risk calculation scores are used separately for assessing the risk of the different adverse events, resulting in a fragmented risk score assessment landscape.
[0010] Summary of the Disclosure
[0011] The present inventors recognized that there was still a need for improved methods and systems for predicting risk for post-treatment complications of STEMI patients, specifically in-hospital mortality, 6-month mortality and 30-day readmission risk. Moreover, they postulated that a more unified approach for predicting the different adverse events would improve the clinical workflow and the patient's journey. Therefore, the inventors have developed methods for risk stratification of STEMI patients for post-treatment complications (Fig. 1 A), using a set of machine learning models to predict the risk of in-hospital mortality after treatment of STEMI, 6-month mortality after treatment of STEMI, and / or the risk of 30-day readmission at hospital after discharge of the STEMI patient. By utilizing a multitude of information from demographics, lab results, vitals, comorbidities, and other relevant data from real-world sources, these models aim to predict short-term and long-term outcomes, guiding clinicians in making informed decisions about patient care throughout the patient's journey (Fig. 1 B).
[0012] To improve the accuracy, generalizability, and practical applicability of the risk stratification system in diverse clinical settings over the existing methods, the inventors have trained and tested a set of ML models using data from a large patient population at multiple clinical sites (i.e. hospitals) and screened a large volume of features to select the final feature set for each prediction. This has allowed the inventors to develop methods that show increased accuracy and improved discriminatory performance for patient risk categorization, compared to clinically established methods.
[0013] Thus, according to a first aspect, there is provided a computer-implemented method of determining a prognosis for a patient who has been admitted to hospital and treated for STEMI, the method comprising: (i) receiving the values of a plurality of predetermined features associated with the patient, the predetermined features comprising: one or more patient demographic features, one or more hospital admission history features, one or more clinical history features, one or more vital signs features and / or one or more laboratory tests features; and (ii) predicting, using the values of said plurality of features, a prognosis for the patient, wherein said predicting comprises using one or more machine learning models to predict a risk of the patient experiencing one or more respective post-treatment complications. The one or more respective post-treatment complications are selected from: readmission within a first predetermined period of time, in-hospital mortality, and mortality within a second predetermined period of time. Each of said one or more machine learning models has been trained to predict the risk of one of said post-treatment complications using training data comprising, for each of a plurality of patients who have been treated for STEMI: (i) the values of a predetermined set of said plurality of features and (ii) an indication of whether the patient has suffered from the post-treatment complication.
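The training data described in this aspect pairs, for each treated patient, the values of a predetermined feature set with an indication of whether a given complication occurred. The following is a minimal sketch of how such per-complication training sets might be assembled. The class name, the function name, the feature names and the outcome identifiers are all hypothetical illustrations, not taken from the disclosure, and the handling of incomplete records (they are simply skipped) is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class PatientRecord:
    """One treated STEMI patient (hypothetical structure)."""
    features: Dict[str, float]  # feature name -> value
    outcomes: Dict[str, bool]   # complication id -> whether it occurred


def build_training_set(
    records: Sequence[PatientRecord],
    complication: str,
    feature_names: Sequence[str],
) -> Tuple[List[List[float]], List[int]]:
    """Return (X, y) for a single complication.

    X holds the values of the predetermined feature set, in the order
    given by feature_names; y holds 1 if the patient suffered the
    complication and 0 otherwise.
    """
    X: List[List[float]] = []
    y: List[int] = []
    for rec in records:
        # Assumption: records with an unknown outcome are skipped.
        if complication not in rec.outcomes:
            continue
        # Assumption: records missing any required feature are skipped
        # (a real pipeline might impute instead).
        if any(name not in rec.features for name in feature_names):
            continue
        X.append([rec.features[name] for name in feature_names])
        y.append(int(rec.outcomes[complication]))
    return X, y
```

Calling build_training_set once per complication, each time with its own feature list, yields one training set per model, which mirrors the one-model-per-complication arrangement described above.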
[0014] The method may have any one or more of the following optional features.
[0015] Receiving the values of the plurality of predetermined features may comprise sending a query to an Electronic Medical Records (EMR) system and receiving the values from said EMR system. The method may comprise receiving a selection of one or more of the post-treatment complications and selecting a machine learning model from a set of machine learning models to predict each selected post-treatment complication, the set of machine learning models comprising a machine learning model trained to predict readmission within the first predetermined period of time, a machine learning model trained to predict in-hospital mortality, and a machine learning model trained to predict mortality within the second predetermined period of time.
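Selecting one model per requested complication can be sketched as a lookup in a registry. The example below is illustrative only: the registry keys, feature names and function names are hypothetical, and the "models" are placeholder logistic functions with arbitrary weights that have no clinical meaning. In practice each entry would be a trained model of one of the types described in the disclosure.

```python
import math
from typing import Callable, Dict, Iterable

Features = Dict[str, float]
Model = Callable[[Features], float]


def make_placeholder_model(weights: Dict[str, float], bias: float) -> Model:
    """Build a stand-in model; the weights are arbitrary, NOT trained."""
    def predict(features: Features) -> float:
        z = bias + sum(w * features[name] for name, w in weights.items())
        return 1.0 / (1.0 + math.exp(-z))  # logistic function -> (0, 1)
    return predict


# Hypothetical registry: one model per post-treatment complication.
MODEL_REGISTRY: Dict[str, Model] = {
    "readmission_30d": make_placeholder_model(
        {"ed_visits": 0.4, "hemoglobin": -0.1}, bias=-1.0),
    "in_hospital_mortality": make_placeholder_model(
        {"cardiac_arrest": 1.0, "cardiogenic_shock": 1.0, "wbc_count": 0.05},
        bias=-3.0),
    "mortality_6m": make_placeholder_model(
        {"cardiac_arrest": 0.8, "congestive_heart_failure": 0.6, "age": 0.02},
        bias=-3.5),
}


def predict_selected(
    features: Features,
    selected: Iterable[str],
    registry: Dict[str, Model] = MODEL_REGISTRY,
) -> Dict[str, float]:
    """Run the model registered for each selected complication."""
    selected = list(selected)
    unknown = [name for name in selected if name not in registry]
    if unknown:
        raise KeyError(f"no model registered for: {unknown}")
    return {name: registry[name](features) for name in selected}
```

Keeping the models behind a single lookup means the caller only states which complications are of interest, while the feature values, whether obtained from an EMR query or elsewhere, are passed in once.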
[0016] Predicting, using the values of said plurality of features, a prognosis for the patient may comprise using a machine learning model that has been trained to predict the risk of in-hospital mortality using training data comprising for each of a plurality of patients who have been treated for STEMI: (i) the values of a first set of features and (ii) an indication of whether the patient has suffered from in-hospital mortality after treatment for STEMI. The plurality of predetermined features may comprise the first set of features comprising: (i) a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, and a feature indicative of whether the patient has suffered cardiogenic shock; and (ii) a plurality of laboratory test features comprising: a feature indicative of the patient’s white blood cell count, and optionally a feature indicative of the patient’s serum albumin level and / or one or more demographic features comprising a feature indicative of the patient’s age.
[0017] Thus, also described herein is a computer-implemented method of determining a risk of in-hospital mortality for a patient who has been admitted to hospital and treated for STEMI, the method comprising: (i) receiving the values of a plurality of predetermined features associated with the patient, the predetermined features comprising a first set of features comprising: (a) a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, and a feature indicative of whether the patient has suffered cardiogenic shock; and (b) a plurality of laboratory test features comprising: a feature indicative of the patient’s white blood cell count, and optionally a feature indicative of the patient’s serum albumin level and / or one or more demographic features comprising a feature indicative of the patient’s age; and (ii) predicting, using the values of said plurality of features, a risk of in-hospital mortality for the patient, using a machine learning model that has been trained to predict the risk of in-hospital mortality using training data comprising for each of a plurality of patients who have been treated for STEMI: (i) the values of a first set of features and (ii) an indication of whether the patient has suffered from in-hospital mortality after treatment for STEMI.
[0018] In embodiments, the first set of features comprises: one or more clinical history features selected from: a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered atrial arrhythmia, and a feature indicative of whether the patient has suffered cerebrovascular disease. In embodiments, the first set of features comprises: one or more demographic features comprising a feature indicative of the patient’s age. In embodiments, the first set of features comprises: one or more laboratory test features selected from: a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, a feature indicative of the patient’s platelet count, and a feature indicative of the patient’s serum albumin level. In embodiments, the first set of features comprises one or more vital signs features selected from: a feature indicative of the patient's systolic blood pressure, and a feature indicative of the patient's oxygen saturation level.
[0019] In embodiments, the first set of features comprises a plurality of clinical history features including a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cardiogenic shock, and at least one further feature selected from: a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered atrial arrhythmia, a feature indicative of whether the patient has suffered cerebrovascular disease.
[0020] In embodiments, the first set of features comprises a plurality of clinical history features including a feature indicative of whether the patient has suffered cardiac arrest and a feature indicative of whether the patient has suffered cardiogenic shock, and a plurality of laboratory test features including a feature indicative of the patient’s white blood cell count and a feature indicative of the patient’s platelet count, and one or more demographic features comprising a feature indicative of the patient’s age.
[0021] In embodiments, the first set of features comprises a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, and a feature indicative of whether the patient has suffered cardiogenic shock, and one or more demographic features comprising a feature indicative of the patient’s age.
[0022] In embodiments, the first set of features comprises a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, and a feature indicative of whether the patient has suffered cardiogenic shock, one or more demographic features comprising a feature indicative of the patient’s age, and one or more laboratory test features including a feature indicative of the patient’s platelet count or a feature indicative of the patient’s serum albumin level. In embodiments, the first set of features comprises a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cardiogenic shock, and a feature indicative of whether the patient has suffered congestive heart failure, one or more demographic features comprising a feature indicative of the patient’s age, and one or more laboratory test features including a feature indicative of the patient’s serum albumin level.
[0023] In embodiments, the first set of features comprises a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cardiogenic shock, and a feature indicative of whether the patient has suffered congestive heart failure, one or more demographic features comprising a feature indicative of the patient’s age, and one or more laboratory test features including a feature indicative of the patient’s serum albumin level.
[0024] In embodiments, the first set of features comprises at least 5 of the following 7 features: a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cardiogenic shock, a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cerebrovascular disease, a plurality of laboratory test features comprising: a feature indicative of the patient's white blood cell count, a feature indicative of the patient's serum albumin level, and one or more demographic features comprising a feature indicative of the patient's age; wherein the at least 5 features include at least one of a feature indicative of the patient's white blood cell count, and a feature indicative of the patient's serum albumin level.
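The "at least 5 of the following 7 features, including at least one of the white blood cell count and serum albumin features" condition can be expressed as a simple set check. The sketch below is illustrative; the feature identifiers and the function name are hypothetical labels for the seven features listed above.

```python
from typing import FrozenSet, Iterable

# Hypothetical identifiers for the seven candidate features.
CANDIDATE_FEATURES: FrozenSet[str] = frozenset({
    "cardiac_arrest",
    "cardiogenic_shock",
    "congestive_heart_failure",
    "cerebrovascular_disease",
    "wbc_count",
    "serum_albumin",
    "age",
})

# At least one of these two laboratory features must be present.
REQUIRED_ANY: FrozenSet[str] = frozenset({"wbc_count", "serum_albumin"})


def satisfies_subset_rule(
    selected: Iterable[str],
    candidates: FrozenSet[str] = CANDIDATE_FEATURES,
    min_count: int = 5,
    required_any: FrozenSet[str] = REQUIRED_ANY,
) -> bool:
    """Check an 'at least N of M, including at least one of K' rule."""
    chosen = set(selected) & candidates  # ignore features outside the list
    return len(chosen) >= min_count and bool(chosen & required_any)
```

The same function could cover the analogous "at least N of M" conditions elsewhere in the description by passing different candidate sets and counts, although conditions that require all of a group of features (rather than any one) would need an additional subset test.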
[0025] In embodiments, the first set of features comprises: a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cardiogenic shock, a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered atrial arrhythmia, and a feature indicative of whether the patient has suffered cerebrovascular disease; a plurality of laboratory test features comprising: a feature indicative of the patient’s white blood cell count, a feature indicative of the patient’s serum albumin level, a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, and a feature indicative of the patient’s platelet count; a plurality of vital signs features comprising: a feature indicative of the patient's systolic blood pressure, and a feature indicative of the patient's oxygen saturation level; and one or more demographic features comprising a feature indicative of the patient’s age.
[0026] In embodiments, the first set of features comprises: a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cardiogenic shock, a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cerebrovascular disease, a plurality of laboratory test features comprising: a feature indicative of the patient's white blood cell count, a feature indicative of the patient's serum albumin level, and one or more demographic features comprising a feature indicative of the patient's age.
[0027] In embodiments, predicting, using the values of said plurality of features, a prognosis for the patient comprises using a machine learning model that has been trained to predict the risk of mortality within the second predetermined period after treatment of STEMI using training data comprising for each of a plurality of patients who have been treated for STEMI: (i) the values of a second plurality of features and (ii) an indication of whether the patient has suffered from mortality within the predetermined period of time after treatment of STEMI. The second set of features comprises: (i) a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, and a feature indicative of whether the patient has suffered cardiac arrest; (ii) one or more demographic features comprising a feature indicative of the patient’s age, and (iii) one or more laboratory test features comprising a feature indicative of the patient’s serum albumin level.
[0028] Thus, also described herein is a computer-implemented method of determining a risk of mortality within the second predetermined period after treatment of STEMI for a patient who has been admitted to hospital and treated for STEMI, the method comprising: (i) receiving the values of a plurality of predetermined features associated with the patient, the predetermined features comprising a second set of features comprising: (a) a plurality of clinical history features comprising a feature indicative of whether the patient has suffered cardiac arrest and one or both of a feature indicative of whether the patient has suffered congestive heart failure and a feature indicative of whether the patient has suffered a cardiogenic shock; and optionally (b) one or more demographic features comprising a feature indicative of the patient’s age, and / or (c) one or more laboratory test features comprising a feature indicative of the patient’s serum albumin level; and (ii) predicting, using the values of said plurality of features, a risk of mortality within the second predetermined period after treatment of STEMI for the patient, using a machine learning model that has been trained to predict the risk of mortality within the second predetermined period after treatment of STEMI using training data comprising for each of a plurality of patients who have been treated for STEMI: (i) the values of the second set of features and (ii) an indication of whether the patient has suffered from mortality within the second predetermined period after treatment of STEMI.
[0029] In embodiments, the second plurality of features comprises: one or more clinical history features selected from: a feature indicative of whether the patient has suffered cerebrovascular disease, a feature indicative of whether the patient has suffered rales and / or jugular venous distension (JVD), and a feature indicative of whether the patient has suffered atrial arrhythmia. In embodiments, the second plurality of features comprises: one or more laboratory test features selected from: a feature indicative of the patient’s blood urea nitrogen (BUN) level, a feature indicative of the patient’s white blood cell count, a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s platelet count, and a feature indicative of the patient’s hemoglobin level. In embodiments, the second plurality of features comprises: one or more vital signs features comprising a feature indicative of the patient's systolic blood pressure.
[0030] In embodiments, the second set of features comprises a plurality of clinical history features including a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cardiac arrest, and one or more further clinical history features selected from: a feature indicative of whether the patient has suffered cerebrovascular disease, a feature indicative of whether the patient has suffered rales and / or jugular venous distension (JVD), a feature indicative of whether the patient has suffered atrial arrhythmia, and a feature indicative of whether the patient has suffered cardiogenic shock.
[0031] In embodiments, the second plurality of features comprises: a plurality of clinical history features including a feature indicative of whether the patient has suffered congestive heart failure and a feature indicative of whether the patient has suffered cardiac arrest, and one or more laboratory test features selected from: a feature indicative of the patient’s white blood cell count, a feature indicative of the patient’s platelet count, and a feature indicative of patient hemoglobin level.
[0032] In embodiments, the second plurality of features comprises: a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cerebrovascular disease, a feature indicative of whether the patient has suffered rales and / or jugular venous distension (JVD), a feature indicative of whether the patient has suffered atrial arrhythmia, and a feature indicative of whether the patient has suffered cardiogenic shock; one or more demographic features comprising a feature indicative of the patient’s age; a plurality of laboratory test features comprising a feature indicative of the patient’s serum albumin level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, a feature indicative of the patient’s white blood cell count, a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s platelet count, and a feature indicative of the patient’s hemoglobin level; and one or more vital signs features comprising a feature indicative of the patient's systolic blood pressure.
[0033] In embodiments, the second plurality of features comprises: at least 6 of the following 8 features: a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cerebrovascular disease, and a feature indicative of whether the patient has suffered cardiogenic shock; one or more demographic features comprising a feature indicative of the patient’s age; and a plurality of laboratory test features comprising a feature indicative of the patient’s serum albumin level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, and a feature indicative of the patient’s white blood cell count; wherein the at least 6 features include at least the clinical history features. In embodiments, the second plurality of features comprises: a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cerebrovascular disease, and a feature indicative of whether the patient has suffered cardiogenic shock; one or more demographic features comprising a feature indicative of the patient’s age; and a plurality of laboratory test features comprising a feature indicative of the patient’s serum albumin level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, and a feature indicative of the patient’s white blood cell count.
In embodiments, predicting, using the values of said plurality of features, a prognosis for the patient comprises using a machine learning model that has been trained to predict the risk of readmission at hospital within the first predetermined period of time after discharge of the STEMI patient using training data that comprises for each of a plurality of patients who have been treated for STEMI: (i) the values of a third plurality of features and (ii) an indication of whether the STEMI patient was readmitted at hospital within the predetermined period of time after discharge. The third set of features comprises: one or more hospital admission history features comprising a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting, and one or more laboratory test features comprising a feature indicative of the patient’s hemoglobin level.
[0034] Thus, also described herein is a computer-implemented method of determining a risk of readmission at hospital within a first predetermined period of time after discharge after treatment of STEMI for a patient who has been admitted to hospital and treated for STEMI, the method comprising: (i) receiving the values of a plurality of predetermined features associated with the patient, the predetermined features comprising a third set of features comprising: one or more hospital admission history features comprising a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting, and one or more laboratory test features comprising a feature indicative of the patient’s hemoglobin level and / or a feature indicative of the patient’s blood urea nitrogen (BUN) level; and (ii) predicting, using the values of said plurality of features, a risk of readmission at hospital within the first predetermined period of time after discharge after treatment of STEMI for the patient, using a machine learning model that has been trained to predict the risk of readmission at hospital within the first predetermined period of time after discharge after treatment of STEMI using training data comprising for each of a plurality of patients who have been treated for STEMI: (i) the values of the third set of features and (ii) an indication of whether the patient has suffered from readmission at hospital within the first predetermined period of time after discharge after treatment of STEMI.
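The hospital admission history feature used for readmission prediction, a number of emergency department visits in a predetermined period before the predicting, can be derived from a list of visit dates. The sketch below is a minimal illustration; the function name and the 180-day default window are assumptions, since the text leaves the length of the period open, and treating the window as closed at its start and open at the prediction date is likewise a choice made here.

```python
from datetime import date, timedelta
from typing import Iterable


def count_ed_visits(
    visit_dates: Iterable[date],
    prediction_date: date,
    lookback_days: int = 180,  # assumed window length, not from the source
) -> int:
    """Count emergency department visits in the window
    [prediction_date - lookback_days, prediction_date)."""
    window_start = prediction_date - timedelta(days=lookback_days)
    return sum(1 for d in visit_dates if window_start <= d < prediction_date)
```

For example, with visits on 2025-01-10, 2025-05-01, 2025-06-30 and 2025-07-01 and a prediction date of 2025-07-01, the default window counts the first three visits, while a 30-day window counts only the visit on 2025-06-30.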
[0035] In embodiments, the third set of features further comprises: one or more of a plurality of laboratory test features selected from: a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s blood high density lipoprotein (HDL) level, a feature indicative of the patient’s blood sodium level, a feature indicative of the patient’s serum albumin level. In embodiments, the third set of features further comprises: one or more hospital admission history features comprising a feature indicative of the patient’s admission duration. In embodiments, the third set of features further comprises: a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered renal disease, a feature indicative of whether the patient has suffered chronic pulmonary disease.
[0036] In embodiments, the third set of features comprises one or more hospital admission history features comprising a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting, and a plurality of laboratory test features including a feature indicative of the patient’s serum creatinine level, and a feature indicative of the patient’s blood urea nitrogen (BUN) level; optionally wherein the plurality of laboratory test features further includes a feature indicative of the patient’s blood high density lipoprotein (HDL) level, a feature indicative of the patient’s blood sodium level, and a feature indicative of the patient’s serum albumin level.
[0037] In embodiments, the third set of features comprises: one or more hospital admission history features comprising a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting; and a plurality of laboratory test features comprising a feature indicative of the patient’s hemoglobin level, a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, a feature indicative of the patient’s blood high density lipoprotein (HDL) level, a feature indicative of the patient’s blood sodium level, and a feature indicative of the patient’s serum albumin level.
[0038] In embodiments, the third set of features comprise: a plurality of hospital admission history features comprising: a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting, and a feature indicative of the patient’s admission duration; and a plurality of laboratory test features comprising a feature indicative of the patient’s hemoglobin level and a feature indicative of the patient’s blood urea nitrogen (BUN) level; and a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered renal disease, and a feature indicative of whether the patient has suffered chronic pulmonary disease.
[0039] In embodiments, the third set of features comprise: at least 5 of the following 7 features: a plurality of hospital admission history features comprising: a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting, and a feature indicative of the patient’s admission duration; and a plurality of laboratory test features comprising a feature indicative of the patient’s hemoglobin level and a feature indicative of the patient’s blood urea nitrogen (BUN) level; and a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered renal disease, and a feature indicative of whether the patient has suffered chronic pulmonary disease; wherein the at least 5 features include at least the plurality of clinical history features.
[0040] In embodiments, the values of patient demographic features are associated with the time of the prediction. In embodiments, values of vital signs features and / or laboratory test features are associated with one or more measurements at a latest available date and / or within a predetermined period of time preceding an index date, or a summarized value derived from a plurality of said values. In embodiments, the index date for prediction of in-hospital mortality and of mortality within a second predetermined period of time is the latest date at which the patient has been diagnosed with STEMI prior to receiving treatment for STEMI, or the index date for prediction of readmission within a first predetermined period of time is the date at which the predicting is performed or the date of discharge of the patient. In embodiments, the second predetermined period of time is between 3 months and 12 months, between 3 months and 9 months, or between 3 months and 6 months. In embodiments, the second predetermined period of time is 6 months. In embodiments, the first predetermined period of time is between 10 and 60 days, between 20 and 40 days, between 20 and 30 days, between 2 and 6 weeks, between 2 weeks and 2 months, or 30 days.
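By way of illustration only, the derivation of a laboratory test feature value relative to an index date might be sketched as follows in Python. This is a minimal sketch and not the disclosed implementation: the function name `summarize_lab_values`, the 7-day lookback period, the choice of the mean as the summarized value, and the example measurements are all assumptions introduced here.

```python
from datetime import date, timedelta
from statistics import mean


def summarize_lab_values(measurements, index_date, lookback_days=7, how="latest"):
    """Derive one feature value from (date, value) measurements.

    Only measurements taken on or before the index date and within the
    lookback period are used. Returns None when no measurement qualifies.
    The 7-day default lookback is an illustrative assumption.
    """
    window_start = index_date - timedelta(days=lookback_days)
    in_window = [(d, v) for d, v in measurements if window_start <= d <= index_date]
    if not in_window:
        return None
    if how == "latest":
        # Value measured at the latest available date in the window.
        return max(in_window, key=lambda dv: dv[0])[1]
    if how == "mean":
        # Summarized value derived from all values in the window.
        return mean(v for _, v in in_window)
    raise ValueError(f"unknown summary: {how}")


# Hypothetical hemoglobin measurements (g/dL) for one patient.
index_date = date(2024, 3, 10)
hemoglobin = [
    (date(2024, 2, 20), 13.9),  # older than the lookback period, ignored
    (date(2024, 3, 8), 12.0),
    (date(2024, 3, 10), 11.0),
    (date(2024, 3, 12), 10.5),  # after the index date, ignored
]
latest_hb = summarize_lab_values(hemoglobin, index_date)
mean_hb = summarize_lab_values(hemoglobin, index_date, how="mean")
```

Restricting the window to dates on or before the index date keeps information recorded after the prediction time out of the feature values.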
[0041] In embodiments, each of the one or more machine learning models is individually selected from: decision trees, regularized and / or gradient boosted decision trees, random forest models, logistic regression models. In embodiments, each of the one or more machine learning models is a non-linear model. In embodiments, each of the one or more machine learning models is an ensemble model. In embodiments, each of the one or more machine learning models is a tree-based model. In embodiments, each of the one or more machine learning models has been trained to classify patients between a plurality of categories comprising a first category associated with a high risk of a post-treatment complication that the machine learning model has been trained to predict, and one or more further categories associated with lower risks of the post-treatment complication that the machine learning model has been trained to predict. In embodiments, the one or more further categories comprise a second category associated with a low risk of the post-treatment complication and a third category associated with a medium risk of the post-treatment complication. In embodiments, each of the one or more machine learning models has been trained to provide as output a probability of the patient experiencing a post-treatment complication. In embodiments, each of the one or more machine learning models has been trained to provide as output a probabilistic score indicative of the risk of the patient experiencing a post-treatment complication. 
In embodiments, each of the one or more machine learning models has been trained to classify patients between a plurality of categories associated with different risks of experiencing a post-treatment complication, wherein the predicting comprises classifying the patient between a plurality of categories by comparing the output of the machine learning model to a first predetermined threshold, wherein the patient is classified in a first, high risk category of experiencing a post-treatment complication when the output of the machine learning model is above the first threshold, and / or by comparing the output of the machine learning model to a second predetermined threshold, wherein the patient is classified in a second, low risk category of experiencing a post-treatment complication when the output of the machine learning model is below the second threshold. In embodiments, the first and second predetermined thresholds are thresholds that have been identified such that patients in a reference cohort, optionally the training patients, classified in the first and second categories represent predetermined proportions of the reference cohort. In embodiments, the predetermined proportions may be the proportions of patients that are classified as high and low risk, respectively, using a known risk classification score for prediction of the post-treatment complication.
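The threshold-based classification described above may be sketched, purely for illustration, as follows. The function names, the reference scores and the target proportions (20% high risk, 30% low risk) are hypothetical and are not taken from the disclosure; the sketch also assumes that the reference scores contain no ties, in which case the proportions are matched exactly.

```python
def thresholds_from_proportions(reference_scores, high_fraction, low_fraction):
    """Identify thresholds such that the requested fractions of a reference
    cohort fall in the high risk and low risk categories.

    Returns (first_threshold, second_threshold): scores above the first
    threshold are high risk, scores below the second threshold are low risk.
    """
    ordered = sorted(reference_scores)
    n = len(ordered)
    n_high = round(high_fraction * n)
    n_low = round(low_fraction * n)
    if n_high + n_low >= n:
        raise ValueError("fractions must leave at least one patient unclassified")
    first_threshold = ordered[n - n_high - 1]  # largest score not in the high group
    second_threshold = ordered[n_low]          # smallest score not in the low group
    return first_threshold, second_threshold


def classify(score, first_threshold, second_threshold):
    """Assign a risk category by comparing a model output to both thresholds."""
    if score > first_threshold:
        return "high"
    if score < second_threshold:
        return "low"
    return "medium"


# Hypothetical model outputs for a reference cohort of ten patients.
reference = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30, 0.45, 0.60, 0.80]
first_t, second_t = thresholds_from_proportions(reference, 0.2, 0.3)
categories = [classify(s, first_t, second_t) for s in reference]
```

In this toy cohort two of the ten patients fall in the high risk category and three in the low risk category, matching the requested proportions.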
[0042] In embodiments, the methods further comprise outputting, to a user interface or computing device, a result of said predicting. In embodiments, the result includes a score output by the one or more machine learning models, or a classification between a plurality of categories associated with respective risks of experiencing the one or more post-treatment complications. In embodiments, the methods comprise selecting a treatment or monitoring plan in accordance with the results of said predicting. In embodiments, the predicting comprises classifying the patient between a plurality of categories associated with respective risks of the patient experiencing a post-treatment complication, and wherein the method comprises selecting a first treatment and / or monitoring plan when the patient is classified in a first, high risk category, and selecting a second treatment and / or monitoring plan when the patient is classified in a second, low risk category.
[0043] According to a second aspect, there is provided a computer-implemented method of obtaining a trained machine learning model for determination of a prognosis of a subject who has been admitted to hospital and treated for STEMI, and a method of providing a tool for predicting the risk of a post-treatment complication for a STEMI patient, the methods comprising: obtaining training data comprising, for each of a plurality of patients who have been treated for STEMI: (i) values of a plurality of predetermined features associated with the patient and (ii) corresponding indication of whether the patient has suffered from the post-treatment complication; and training a machine learning model to predict the risk of a STEMI patient suffering from the post-treatment complication using said training data, wherein the machine learning model is trained to take as input the values of said plurality of features and to produce as output an indication of risk of a STEMI patient suffering from the post-treatment complication; wherein the predetermined features comprise: one or more patient demographic features, one or more hospital admission history features, one or more clinical history features, one or more vital signs features and / or one or more laboratory tests, and wherein the one or more post-treatment complications are selected from: readmission within a first predetermined period of time, in-hospital mortality, and mortality within a second predetermined period of time.
[0044] The method may have any one or more of the following optional features. In embodiments, the methods comprise selecting, from a plurality of candidate features, the predetermined features for one or more of the machine learning models, said selecting comprising: determining the values of one or more feature importance metrics, optionally a SHAP value, for each of said candidate features; ranking the set of candidate features according to the feature importance values; and identifying a subset of said candidate features using one or more model performance metrics including the Area Under the Receiver Operating Characteristic Curve (AUC) by iteratively including additional features of lower rank and evaluating said one or more model performance metrics, wherein the predetermined set of features is selected as the features whose inclusion improves the AUC of the model.
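The ranking and iterative inclusion steps above may be illustrated with the following minimal Python sketch. It rests on several assumptions that are not part of the disclosure: the importance values are hypothetical stand-ins for a metric such as a mean absolute SHAP value, the feature names and data are invented, and the "model" is a toy scorer that sums the selected feature values, standing in for retraining and cross-validating a real model at each step.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def forward_select(ranked_features, evaluate_auc):
    """Walk features from highest to lowest rank and keep each feature
    whose inclusion improves the AUC returned by evaluate_auc."""
    selected, best_auc = [], 0.5  # 0.5 is the AUC of an uninformative model
    for feature in ranked_features:
        candidate_auc = evaluate_auc(selected + [feature])
        if candidate_auc > best_auc:
            selected.append(feature)
            best_auc = candidate_auc
    return selected, best_auc


# Hypothetical toy data: six patients, three binary candidate features.
labels = [1, 1, 1, 0, 0, 0]
data = {
    "bun_high": [1, 1, 0, 0, 0, 0],
    "hb_low": [1, 0, 1, 0, 1, 0],
    "noise": [0, 1, 0, 1, 0, 1],
}
# Hypothetical importance values used only to rank the candidates.
importance = {"bun_high": 0.30, "hb_low": 0.20, "noise": 0.01}
ranked = sorted(importance, key=importance.get, reverse=True)


def toy_evaluate(features):
    """Toy stand-in for model training: score = sum of selected features."""
    scores = [sum(data[f][i] for f in features) for i in range(len(labels))]
    return auc(labels, scores)


selected, best_auc = forward_select(ranked, toy_evaluate)
```

In this toy example the two informative features are retained and the uninformative one is discarded because adding it lowers the AUC.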
[0045] The methods of the present aspect may have any of the features described in relation to the first aspect. For example, the plurality of predetermined features may comprise the first set of features, the second set of features, and / or the third set of features. Also described is a method according to the first aspect comprising training the one or more models using a method according to the second aspect.
[0046] Also described according to a third aspect is a method of treating a subject that has been admitted to hospital and treated for STEMI, the method comprising determining a prognosis for the subject using the method of any embodiment of the first aspect, and (i) treating the subject with a first therapy or treatment plan, when the subject is classified in the first class; and / or (ii) treating the subject with a second therapy or treatment plan, when the subject is not classified in the first class.
[0047] Also described according to a fourth aspect is a system comprising a processor and one or more computer readable media storing instructions that, when executed by the processor, cause the processor to implement the method of any embodiment of the first and / or second aspects.
[0048] Also described according to a fifth aspect is a computer readable medium or media storing instructions that, when executed by a processor, cause the processor to implement the method of any embodiment of the first and / or second aspects.
[0049] Also described according to a sixth aspect is a computer program product comprising instructions that, when executed by a processor, cause the processor to implement the method of any embodiment of the first and / or second aspects.
[0050] The disclosure also encompasses embodiments including any combinations of features described in relation to any of the above aspects, unless such features are clearly incompatible.
[0051] Brief description of drawings
[0052] Fig. 1A shows schematically an advanced algorithm system hosted on an algorithm hosting platform, designed for seamless integration with Electronic Medical Records (EMR) systems.
[0053] Fig. 1B illustrates schematically a use of a risk stratification system along the STEMI patient journey as described herein. Fig. 2 is a flow diagram showing, in schematic form, a method of providing a risk stratification according to the disclosure.
[0054] Fig. 3 shows an embodiment of a system for implementing methods of the disclosure.
[0055] Fig. 4 illustrates schematically a model training flowchart used in the examples of the disclosure.
[0056] Fig. 5A illustrates the cohort composition of the training data used for training a model for prediction of in-hospital mortality in examples of the disclosure.
[0057] Fig. 5B illustrates the cohort composition of the training data used for training a model for prediction of 6-month mortality in examples of the disclosure.
[0058] Fig. 5C illustrates the cohort composition of the training data used for training a model for prediction of 30-day readmission in examples of the disclosure.
[0059] Fig. 6 illustrates schematically how index dates and lookback periods were determined in examples of the disclosure.
[0060] Fig. 7 illustrates a scheme used for partitioning of training data in the examples of the disclosure.
[0061] Fig. 8A illustrates the AUC curve of in-hospital mortality feature forward selection.
[0062] Fig. 8B illustrates the AUC curve of 6-month mortality feature forward selection.
[0063] Fig. 8C illustrates the AUC curve of 30-day readmission feature forward selection.
[0064] Fig. 9A shows SHAP values for predictive variables tested for the prediction of in-hospital mortality.
[0065] Fig. 9B shows SHAP values for predictive variables tested for the prediction of 6-month mortality.
[0066] Fig. 9C shows SHAP values for predictive variables tested for the prediction of 30-day readmission.
[0067] Fig. 10A illustrates schematically an exemplary method for identifying thresholds for classifying patients as low, medium or high risk using machine learning models of the disclosure.
[0068] Fig. 10B illustrates schematically an exemplary method for assessing thresholds for classifying patients as low, medium or high risk using machine learning models of the disclosure.
[0069] Fig. 11A shows results evaluating the performance of different in-hospital mortality prediction models of the disclosure and the benchmark clinical score GRACE 2.0.
[0070] Fig. 11B shows results evaluating the performance of different 6-month mortality prediction models of the disclosure and the benchmark clinical score GRACE 2.0.
[0071] Fig. 11C shows results evaluating the performance of different 30-day readmission prediction models of the disclosure and the benchmark clinical score LACE index. Fig. 12A shows a calibration curve for an in-hospital mortality prediction model of the disclosure (XGBoost model) and the benchmark clinical score GRACE 2.0 and corresponding Brier score (BS). The plots show the predicted vs true probability in each bin (top) and a box-and-whisker plot showing the distribution of predicted probabilities (bottom).
[0072] Fig. 12B shows a calibration curve for a 6-month mortality prediction model of the disclosure (XGBoost model) and the benchmark clinical score GRACE 2.0 and corresponding Brier score (BS). The plots show the predicted vs true probability in each bin (top) and a box-and-whisker plot showing the distribution of predicted probabilities (bottom).
[0073] Fig. 12C shows a calibration curve for a 30-day readmission prediction model of the disclosure (CatBoost model) and the benchmark clinical score LACE index and corresponding Brier score (BS). The plots show the predicted vs true probability in each bin (top) and a box-and-whisker plot showing the distribution of predicted probabilities (bottom).
[0074] Fig. 13A, B show results of a feature ablation analysis (impact of missing features) for a rule in (Fig. 13A) and rule out (Fig. 13B) classification using an in-hospital mortality XGBoost prediction model of the disclosure.
[0075] Fig. 13C, D show results of a feature ablation analysis (impact of missing features) for a rule in (Fig. 13C) and rule out (Fig. 13D) classification using a 6-month mortality XGBoost prediction model of the disclosure. Fig. 13E, F show results of a feature ablation analysis (impact of missing features) for a rule in (Fig. 13E) and rule out (Fig. 13F) classification using the 30-day readmission CatBoost prediction model of the disclosure.
[0076] Fig. 14A illustrates the cohort composition of training data used in Example 2 for training a model for prediction of in-hospital mortality in examples of the disclosure.
[0077] Fig. 14B illustrates the cohort composition of the training data used in Example 2 for training a model for prediction of 6-month mortality in examples of the disclosure.
[0078] Fig. 14C illustrates the cohort composition of the training data used in Example 2 for training a model for prediction of 30-day readmission in examples of the disclosure.
[0079] Fig. 15 illustrates a scheme used for partitioning training data in Example 2 of the disclosure.
[0080] Fig. 16 illustrates schematically a model training flowchart used in Example 2 of the disclosure.
[0081] Fig. 17A shows results of the forward feature selection for prediction of in-hospital mortality described in Example 2. The plot shows the mean AUC over a plurality of cross-validation folds, the 95% confidence interval around the mean and analytical overlays (Last Maximum AUC Mean, p-value curve, set of Power Curves, Plateau Point). Fig. 17B shows results of the forward feature selection for prediction of 6-month mortality described in Example 2. The plot shows the mean AUC over a plurality of cross-validation folds, the 95% confidence interval around the mean and analytical overlays (Last Maximum AUC Mean, p-value curve, set of Power Curves, Plateau Point).
[0082] Fig. 17C shows results of the forward feature selection for prediction of 30-day readmission described in Example 2. The plot shows the mean AUC over a plurality of cross-validation folds, the 95% confidence interval around the mean and analytical overlays (Last Maximum AUC Mean, p-value curve, set of Power Curves, Plateau Point).
[0083] Fig. 18A shows SHAP values of the final feature list of a trained LR model to predict the risk of in-hospital mortality described in Example 2.
[0084] Fig. 18B shows SHAP values of the final feature list of a trained LR model to predict the risk of 6-month mortality described in Example 2.
[0085] Fig. 18C shows SHAP values of the final feature list of a trained LR model to predict the risk of 30-day readmission described in Example 2.
[0086] Fig. 19A shows a calibration curve for an in-hospital mortality prediction model described in Example 2 (LR model - here referred to as “PREMIER-PERISCOPE”) and the benchmark clinical score GRACE 2.0 (applied to the same data), and corresponding Brier scores (BS). The “_all” suffix used in the figure indicates that the analysis includes all patients rather than a specific subgroup. The plots show the predicted vs true probability in each bin (top) and a box-and-whisker plot showing the distribution of predicted probabilities (bottom).
[0087] Fig. 19B shows a calibration curve for a 6-month mortality prediction model described in Example 2 (LR model - here referred to as “PREMIER-PERISCOPE”) and the benchmark clinical score GRACE 2.0, and corresponding Brier scores (BS). The “_all” suffix used in the figure indicates that the analysis includes all patients rather than a specific subgroup. The plots show the predicted vs true probability in each bin (top) and a box-and-whisker plot showing the distribution of predicted probabilities (bottom).
[0088] Fig. 19C shows a calibration curve for a 30-day readmission prediction model described in Example 2 (LR model - here referred to as “PREMIER-PERISCOPE”) and the benchmark clinical score LACE index, and corresponding Brier scores (BS). The “_all” suffix used in the figure indicates that the analysis includes all patients rather than a specific subgroup. The plots show the predicted vs true probability in each bin (top) and a box-and-whisker plot showing the distribution of predicted probabilities (bottom).
[0089] Fig. 20A shows results of a feature ablation analysis (impact of missing features) for an in-hospital mortality LR prediction model described in Example 2.
[0090] Fig. 20B shows results of a feature subgroup missingness analysis (impact of missing features) for an in-hospital mortality LR prediction model described in Example 2. Fig. 20C shows results of a feature ablation analysis (impact of missing features) for a 6-month mortality LR prediction model described in Example 2.
[0091] Fig. 20D shows results of a feature subgroup missingness analysis (impact of missing features) for a 6-month mortality LR prediction model described in Example 2.
[0092] Fig. 20E shows results of a feature ablation analysis (impact of missing features) for a 30-day readmission LR prediction model described in Example 2.
[0093] Fig. 20F shows results of a feature subgroup missingness analysis (impact of missing features) for a 30-day readmission LR prediction model described in Example 2.
[0094] Detailed description
[0095] The present disclosure relates broadly to machine learning models that are predictive of prognosis in subjects that have been admitted to hospital and treated for STEMI.
[0096] In describing the present invention, the following terms will be employed, and are intended to be defined as indicated below.
[0097] The terms “subject” and “patient” are used herein interchangeably. A subject may be a human. The subject according to the present disclosure is a subject who has been identified as having or being likely to have STEMI. Diagnosis of STEMI may be through an ECG and / or troponin test. The subject may be a subject who has been admitted to hospital and treated for STEMI. As used herein, “treatment” refers to reducing, alleviating or eliminating one or more symptoms of the disease or condition (e.g. STEMI) which is being treated, relative to the symptoms prior to treatment. Treatment for STEMI may include percutaneous coronary intervention (PCI). Thus, the subject may be a subject who has been treated with PCI. The subject may have been treated with PCI because of suspected or confirmed STEMI. The subject may be a subject who has been previously diagnosed as having heart failure. The subject may be in hospital at the point of assessment. The subject may be assessed after diagnosis and treatment of STEMI. The subject may be assessed prior to discharge from hospital. The subject may be in hospital following treatment for STEMI. Thus, the subject may be a subject who has been treated for STEMI and has been in hospital since the treatment and / or diagnosis of STEMI. The subject may be a subject who has been discharged from hospital following treatment for STEMI. The subject may be a subject who has been identified as having or being likely to have STEMI on the same day as, or at most 1 day, at most 5 days, at most 6 days, at most 10 days, at most 20 days, at most 30 days, at most 1 month, at most 2 months, at most 3 months, at most 4 months, at most 5 months or at most 6 months prior to, assessment using the methods of the disclosure. The subject may be a subject who has been treated for STEMI at most 10 days, at most 20 days, at most 30 days, or at most 3 months prior to assessment using the methods of the disclosure.
[0098] Embodiments of the present disclosure make use of machine learning models. The machine learning models described herein are mathematical models trained (i.e. parameterized) to classify observations between a plurality of classes, i.e. classifiers (also referred to as “classification model”). A classification model may provide as output a classification label and / or one or more probabilities of an observation belonging to respective one or more classes. A binary classification model may provide as output a single probability or score indicating the probability that the observation belongs to a positive class (e.g. a class with a high risk of in hospital mortality, readmission and / or 6-month mortality). A predetermined threshold may be applied to the one or more probabilities, to assign a class label. Alternatively, an observation may simply be assigned to the class with highest probability. When using a binary classifier this may be equivalent to assigning the class for which the probability is over 50%. Further, an observation may be assigned to one of a low risk class, a medium risk class and a high risk class depending on whether the probability is under a first threshold (also referred to as “rule-out” threshold, identifying subjects at low risk of complication), between the first threshold and a second threshold, or above the second threshold (also referred to as “rule-in” threshold, identifying subjects at high risk of post-treatment complication). The predetermined threshold and / or first and second thresholds may be determined based on desired characteristics of the classification, such as e.g. 
a desired level of specificity, sensitivity, or any performance metric that combines aspects of specificity or precision with sensitivity (recall), such as accuracy and F1 score or balanced versions thereof that take into account the proportions of observations in the training data in each of the classes, and / or based on known prevalence of subjects in the category, as explained further elsewhere herein. Embodiments of the present disclosure make use of one or more binary classifiers. The one or more binary classifiers may each output a score. The score may be a score between 0 and 1 that can be interpreted as a probability that a subject has a predetermined complication. Any machine learning classifier known in the art may be used in the context of the present disclosure. Indeed, the present inventors have demonstrated that useful predictions can be obtained when using any classifier, from a simple logistic regression to a non-linear tree-based model such as a gradient boosted tree model (e.g. CatBoost).
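As a non-limiting illustration of how a rule-out and a rule-in threshold might be determined from a desired sensitivity and a desired specificity, consider the following Python sketch. The function names, the toy labels and scores, and the target values (100% sensitivity for rule-out, 100% specificity for rule-in) are assumptions made here for illustration. The sketch treats a score at or above a threshold as a positive call.

```python
def sensitivity(labels, scores, threshold):
    """Fraction of positive subjects with a score at or above the threshold."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    return sum(s >= threshold for s in pos) / len(pos)


def specificity(labels, scores, threshold):
    """Fraction of negative subjects with a score below the threshold."""
    neg = [s for y, s in zip(labels, scores) if y == 0]
    return sum(s < threshold for s in neg) / len(neg)


def pick_thresholds(labels, scores, min_sensitivity, min_specificity):
    """Return (rule_out, rule_in) thresholds from the observed scores.

    rule_out: highest cut-off that still achieves the required sensitivity.
    rule_in: lowest cut-off that achieves the required specificity, or None
    when no observed score reaches it.
    """
    candidates = sorted(set(scores))
    rule_out = max(
        t for t in candidates if sensitivity(labels, scores, t) >= min_sensitivity
    )
    rule_in = min(
        (t for t in candidates if specificity(labels, scores, t) >= min_specificity),
        default=None,
    )
    return rule_out, rule_in


# Hypothetical ground truth labels (1 = complication) and model scores.
labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.70, 0.90]
rule_out, rule_in = pick_thresholds(labels, scores, 1.0, 1.0)
```

With these toy values, subjects scoring under the rule-out threshold would be low risk, subjects scoring at or above the rule-in threshold would be high risk, and the remaining subjects would be medium risk.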
[0099] The term “machine learning algorithm” or “machine learning method” refers to an algorithm or method that trains and / or deploys a machine learning model. The machine learning models of the present disclosure are trained by supervised learning, i.e. using training data comprising observations for each of a set of predictive variables for each of a plurality of training subjects, and a class label (also referred to as ground truth label) indicating whether a training subject suffered one or more post-treatment complications. Supervised learning refers to the parameterization of a model so as to optimize an objective criterion based on the comparison of predicted and ground truth labels for observations in the training data. An objective criterion may be the minimization of a loss function that quantifies the model prediction error based on the observed (ground truth) and predicted values of the predicted variables. Suitable loss functions for use in training machine learning models are known in the art and include the mean squared error, the mean absolute error, and regularized versions thereof. Any of these can be used according to the present disclosure. Regularized loss functions are functions that include a loss function as described above, and one or more terms penalizing model complexity in order to reduce the risk of overfitting. Overfitting is a phenomenon that occurs where a machine learning model is trained to very closely reproduce the features of a training data set, resulting in poorer performance on other datasets that do not have the same characteristics (i.e. poor generalizability). L1 regularization (also known as “Lasso” in the context of regression) adds a regularization term to the loss function that penalizes models based on the sum of the absolute values of the coefficients of the model. 
L2 regularization (also known as “Ridge” in the context of regression) adds a regularization term to the loss function that penalizes models based on the sum of the squared values of the coefficients of the model. L1 regularization can be used as a feature selection method as it shrinks the coefficients associated with less informative predictive features towards zero. In embodiments, the machine learning model is a regularized model. In embodiments, the machine learning model is a regularized tree-based model. In embodiments, the machine learning model is a regularized gradient boosted decision tree model. Examples of such models are available in the CatBoost software library (catboost.ai/) and in the XGBoost software library (xgboost.ai/). Such models may be referred to as “CatBoost” and “XGBoost” models, although any other implementation of regularized gradient boosted models may equally be used.
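For reference, the L1 and L2 regularized objectives described above can be written as follows. The notation is introduced here for illustration only: $\mathcal{L}(\theta)$ denotes the unregularized loss, $\theta_j$ the coefficients of the model, and $\lambda \geq 0$ a regularization strength.

```latex
\mathcal{L}_{\mathrm{L1}}(\theta) = \mathcal{L}(\theta) + \lambda \sum_{j} \lvert \theta_j \rvert
\qquad
\mathcal{L}_{\mathrm{L2}}(\theta) = \mathcal{L}(\theta) + \lambda \sum_{j} \theta_j^{2}
```

Setting $\lambda = 0$ recovers the unregularized loss, while larger values of $\lambda$ penalize large coefficients more strongly.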
[0100] A machine learning model as described herein may be selected from: decision trees and variants thereof including regularized and / or gradient boosted decision trees and random forest models, regularized discriminant analysis, logistic regression models, generalized linear models, artificial neural networks (ANNs) including multilayer perceptrons (with linear or non-linear activation functions) and deep learning models (e.g. long short-term memory networks (LSTMs), Recurrent neural networks (RNNs)), naive Bayes classifiers, and support vector machines (SVM, using linear or non-linear kernels such as radial basis function). Non-linear models, such as decision trees and variants thereof (including in particular random forests and gradient boosted trees), SVM with a non-linear kernel, and ANNs (e.g. multilayer perceptrons) with non-linear activation functions may be particularly performant.
[0101] In embodiments, a machine learning model comprises an ensemble of models whose predictions are combined. Alternatively, a machine learning model may comprise a single model. Random forest models and gradient boosted tree models (such as XGBoost and CatBoost) are ensemble models. Ensemble versions of any of the models can be constructed. Ensemble models are expected to result in better prediction performance than single models and may therefore be advantageous in the context of the methods described herein. For example, the machine learning model may be a random forest classifier or a gradient boosted decision tree model. A random forest classifier is a model that comprises an ensemble of decision trees and outputs a class based on the combined (e.g. averaged) prediction of the individual trees. Decision trees perform recursive partitioning of a feature space until each leaf (final partition set) is associated with a single value of the target. Gradient boosting is a machine learning method that forms an ensemble of weak prediction models (e.g. decision trees) from which a combined strong prediction is obtained. The algorithm iteratively adds new weak predictors to improve the prediction obtained by combining the outputs of the weak predictors. By contrast, a random forest trains a set number of trees independently of one another, each using a random subset of the training data. Conversely, very simple models such as linear regression models (also referred to herein as “generalized linear models”) can be useful to enable implementation by clinicians. In embodiments, the machine learning model is selected from: decision trees and variants thereof including regularized and / or gradient boosted decision trees and random forest models, logistic regression models. In embodiments, the machine learning model is a non-linear model. In embodiments, the machine learning model is an ensemble model. In embodiments, the machine learning model is a tree-based model.
[0102] Machine learning models of the present disclosure are trained to predict prognosis for a subject, the prognosis including an indication of the risk that the subject will suffer one or more complications (also referred to as “adverse events”). The one or more complications may be referred to as “post-treatment complications”, in the context of complications that may be experienced by a subject who has been admitted to hospital and treated for STEMI. The one or more complications may be selected from: mortality, and readmission within a first predetermined period of time. Readmission may refer to the risk that the patient is discharged then re-admitted within a first period of time from the time of assessment using a method as described herein. Mortality may be selected from in-hospital mortality and mortality within a second predetermined period of time. In-hospital mortality refers to mortality during the hospital stay following the STEMI patient receiving treatment for STEMI. Mortality may be all-cause mortality. Readmission may be all-cause readmission or re-admission for cardiac symptoms. Readmission may be all-cause readmission. The cardiac symptoms may include any known symptoms of cardiac dysfunction, such as e.g. symptoms of heart failure. Thus, re-admission for cardiac symptoms may encompass re-admission for suspected or confirmed cardiac dysfunction. The cardiac dysfunction may include one or more of: arrhythmia, heart failure. A STEMI patient refers to a patient who has been diagnosed with STEMI and has received treatment for STEMI. The treatment may be in a current (i.e. ongoing) or latest hospital stay.
[0103] A first and / or second predetermined period of time may be between 10 days and a year. The first and second predetermined periods of time may be the same or different. The first and / or second predetermined period may be at least 10 days, at least 20 days, at least 30 days, or at least a month. The first and / or second predetermined period may be at most a year, at most 6 months, at most 3 months. The first predetermined period may be between 10 days and 6 months. The first predetermined period may be between 10 days and 3 months. The first predetermined period may be between 20 days and 3 months. The first predetermined period may be 30 days. The second predetermined period may be between 10 days and 12 months, between 10 days and 6 months, between 1 month and 12 months, between 1 month and 9 months, between 1 month and 6 months, between 3 months and 9 months, or between 3 months and 6 months. The second predetermined period may be between 1 month and 12 months. The second predetermined period may be between 3 months and 6 months. The second predetermined period may be 6 months. A first predetermined period of time may be a period of time from discharge of the patient from hospital following treatment for STEMI. In other words, a first predetermined period of time may be a period of time from an index date which is the date of discharge of the patient from hospital. The date of discharge from hospital may be a proposed date of discharge from hospital. The date of discharge from hospital may be a current date. In other words, predicting risk of readmission to hospital within a predetermined period may comprise predicting the risk of readmission to hospital within a predetermined period from the current day (i.e. answering the question of if the patient was discharged today, what would be their risk of readmission within the first predetermined period of time). 
A second predetermined period of time may be a period of time from diagnosis of the patient with STEMI. The diagnosis with STEMI may be the latest diagnosis with STEMI preceding treatment of the subject for STEMI in the course of the latest hospital stay of the patient. In other words, the second predetermined period may run from the time of the latest STEMI diagnosis that was associated with the patient receiving treatment for STEMI. This is typically but does not need to be the same day as the patient receiving treatment for STEMI.
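By way of non-limiting illustration only, the two index dates described above (date of discharge for readmission, date of STEMI diagnosis for mortality) may be used to derive outcome labels as sketched below. The function names, the 183-day approximation of 6 months, and the treatment of window boundaries are assumptions made for the example and are not prescribed by the disclosure.

```python
from datetime import date, timedelta
from typing import Optional

# Example values only: 30 days for the first period, ~6 months for the second.
READMISSION_WINDOW = timedelta(days=30)
MORTALITY_WINDOW = timedelta(days=183)  # assumed day count for "6 months"


def readmitted_within_window(discharge: date,
                             readmission: Optional[date],
                             window: timedelta = READMISSION_WINDOW) -> bool:
    """Label readmission relative to the discharge date (the index date).

    Assumption: a readmission counts if it falls after discharge and no
    later than discharge + window.
    """
    if readmission is None:
        return False
    return discharge < readmission <= discharge + window


def died_within_window(diagnosis: date,
                       death: Optional[date],
                       window: timedelta = MORTALITY_WINDOW) -> bool:
    """Label mortality relative to the latest STEMI diagnosis (the index date).

    Assumption: a death on the diagnosis date itself is counted.
    """
    if death is None:
        return False
    return diagnosis <= death <= diagnosis + window
```

For a prospective assessment, the current date or a proposed discharge date may be passed as `discharge`, consistent with the "if the patient was discharged today" reading given above.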
[0104] In embodiments, a machine learning model as described herein has been trained to classify patients between a plurality of categories comprising a first category associated with a high risk of in-hospital mortality and one or more further categories associated with lower risks of in-hospital mortality. In embodiments, the one or more further categories comprise a second category associated with a low risk of in-hospital mortality and a third category associated with a medium risk of in-hospital mortality. In embodiments, a machine learning model as described herein provides as output a probability of the patient experiencing in-hospital mortality. In embodiments, a machine learning model as described herein provides as output a probabilistic score indicative of the risk of the patient experiencing in-hospital mortality. In embodiments, the machine learning model has been trained to classify patients between a plurality of categories comprising a first category associated with a low risk of in-hospital mortality and one or more further categories associated with higher risks of in-hospital mortality. In embodiments, the one or more further categories comprise a second category associated with a high risk of in-hospital mortality and a third category associated with a medium risk of in-hospital mortality. In embodiments, the method further comprises outputting the predicted risk and / or whether the patient is in the low, medium or high risk category of in-hospital mortality after treatment of STEMI. In embodiments, the method further comprises selecting a treatment or monitoring plan based on the predicted risk and / or risk category of in-hospital mortality after treatment of STEMI. 
In embodiments, the method comprises selecting a first treatment and / or monitoring plan when the patient is classified in the low risk category, and selecting a second treatment and / or monitoring plan when the patient is classified in the high risk category of in-hospital mortality after treatment of STEMI.
[0105] In embodiments, a machine learning model as described herein has been trained to classify patients between a plurality of categories comprising a first category associated with a high risk of mortality within a second predetermined period of time and one or more further categories associated with lower risks of mortality within the second predetermined period of time. In embodiments, the one or more further categories comprise a second category associated with a low risk of mortality within the second predetermined period of time and a third category associated with a medium risk of mortality within the second predetermined period of time. In embodiments, a machine learning model as described herein provides as output a probability of the patient experiencing mortality within the second predetermined period of time. In embodiments, a machine learning model as described herein provides as output a probabilistic score indicative of the risk of the patient experiencing mortality within the second predetermined period of time. In embodiments, a machine learning model as described herein has been trained to classify patients between a plurality of categories comprising a first category associated with a low risk of mortality within the second predetermined period of time and one or more further categories associated with higher risks of mortality within the second predetermined period of time. In embodiments, the one or more further categories comprise a second category associated with a high risk of mortality within the second predetermined period of time and a third category associated with a medium risk of mortality within the second predetermined period of time. In embodiments, the method further comprises outputting the predicted risk and / or whether the patient is in the low, medium or high risk category of mortality within the second predetermined period of time. 
In embodiments, the method further comprises selecting a treatment or monitoring plan based on the predicted risk and / or risk category of mortality within the second predetermined period of time. In embodiments, the method comprises selecting a first treatment and / or monitoring plan when the patient is classified in the low risk category, and selecting a second treatment and / or monitoring plan when the patient is classified in the high risk category of mortality within the second predetermined period of time.
[0106] In embodiments, a machine learning model as described herein has been trained to classify patients between a plurality of categories comprising a first category associated with a high risk of readmission within a first predetermined period of time and one or more further categories associated with lower risks of readmission within the first predetermined period of time. In embodiments, the one or more further categories comprise a second category associated with a low risk of readmission within the first predetermined period of time and a third category associated with a medium risk of readmission within the first predetermined period of time. In embodiments, a machine learning model as described herein provides as output a probability of the patient experiencing readmission within the first predetermined period of time. In embodiments, a machine learning model as described herein provides as output a probabilistic score indicative of the risk of the patient experiencing readmission within the first predetermined period of time. In embodiments, a machine learning model as described herein has been trained to classify patients between a plurality of categories comprising a first category associated with a low risk of readmission within the first predetermined period of time and one or more further categories associated with higher risks of readmission within the first predetermined period of time. In embodiments, the one or more further categories comprise a second category associated with a high risk of readmission within the first predetermined period of time and a third category associated with a medium risk of readmission within the first predetermined period of time. In embodiments, the method further comprises outputting the predicted risk and / or whether the patient is in the low, medium or high risk category of readmission within the first predetermined period of time. 
In embodiments, the method further comprises selecting a treatment or monitoring plan based on the predicted risk and / or risk category of readmission within the first predetermined period of time. In embodiments, the method comprises selecting a first treatment and / or monitoring plan when the patient is classified in the low risk category, and selecting a second treatment and / or monitoring plan when the patient is classified in the high risk category of readmission within the first predetermined period of time.
[0107] In embodiments, classifying the patient between a plurality of categories comprises comparing the output of a machine learning model as described herein to one or more predetermined thresholds. In embodiments, classifying the patient between a plurality of categories comprises comparing the output of the machine learning model to a first predetermined threshold, wherein the patient is classified in the first, high risk category when the output of the machine learning model is above the first threshold. In embodiments, classifying the patient between a plurality of categories comprises comparing the output of the machine learning model to a second predetermined threshold, wherein the patient is classified in the second, low risk category when the output of the machine learning model is below the second threshold. In embodiments, the patient may be classified in the third, medium risk category when the output of the model is at or above the second predetermined threshold and at or below the first predetermined threshold. In embodiments, the first and second predetermined thresholds may be thresholds that have been identified such that patients in a reference cohort, optionally the training patients, classified in the first, second and third categories represent predetermined proportions of the reference cohort. The predetermined proportions may be proportions that are classified as low, medium and high risk, respectively, using a known risk classification score, such as the GRACE 2.0 score (e.g. for prediction of mortality), or the LACE score (e.g. for prediction of readmission).
[0108] According to an embodiment of the present disclosure, a machine learning model classifies subjects in three groups of low, medium and high probability of risk (also referred to herein as low, medium and high probability) of post-treatment complications of a STEMI patient. To classify the subjects into the three categories, cutoff values (also referred to herein as “thresholds” or first / second predetermined values) may be determined separately for each of a plurality of complications (also referred to herein as adverse events). Cutoff values may be determined based on the proportions of subjects expected to be classified within a predetermined category of risk using a standard risk scoring method. For example, for mortality the standard risk scoring method may be the GRACE (Fox et al. 2006) or GRACE 2.0 score (Fox et al. 2014). As another example, for readmission the standard risk scoring method may be the LACE score (Walraven et al. 2010). Using such cutoffs ensures that the proportions of patients classified in specific risk categories (e.g. patients classified in the low risk category and / or patients classified in the high risk category) remain the same as with current standard of care approaches, which enables practical adoption of the algorithms without overwhelming the healthcare system or reducing the standard of care provided. However, because the present methods have better sensitivity and specificity than the current standard of care, the specific patients that are classified in these categories are more likely to be ones that have been correctly identified.
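By way of non-limiting illustration only, cutoff values matching predetermined proportions of a reference cohort may be derived, and then applied, along the following lines. The function names are hypothetical, and the proportions passed in would in practice be those observed with a standard score such as GRACE 2.0 or LACE; no particular proportions are asserted here. The comparison rules follow those stated above: high risk when the output is above the first threshold, low risk when it is below the second threshold, medium risk otherwise.

```python
def cutoffs_from_proportions(reference_scores, low_fraction, high_fraction):
    """Return (second_threshold, first_threshold), i.e. (low cutoff, high cutoff).

    The cutoffs are chosen so that roughly `low_fraction` of the reference
    cohort scores strictly below the low cutoff and roughly `high_fraction`
    scores strictly above the high cutoff (exactly so when there are no ties).
    """
    if not reference_scores:
        raise ValueError("reference cohort is empty")
    if low_fraction <= 0 or high_fraction <= 0 or low_fraction + high_fraction >= 1:
        raise ValueError("fractions must be positive and sum to less than 1")
    ranked = sorted(reference_scores)
    n = len(ranked)
    n_low = round(n * low_fraction)
    n_high = round(n * high_fraction)
    if n_low + n_high >= n:
        raise ValueError("reference cohort too small for these fractions")
    low_cut = ranked[n_low]            # n_low scores lie below this value
    high_cut = ranked[n - n_high - 1]  # n_high scores lie above this value
    return low_cut, high_cut


def classify(score, low_cut, high_cut):
    """Assign a risk category from a model output and the two thresholds."""
    if score > high_cut:
        return "high"
    if score < low_cut:
        return "low"
    return "medium"  # at or above the low cutoff and at or below the high cutoff
```

Separate pairs of cutoffs would be derived for each complication (in-hospital mortality, mortality within the second predetermined period, readmission), each from the outputs of the corresponding model on the reference cohort.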
[0109] In embodiments, the machine learning model has been trained using a training data set comprising data for at least 1000, 5000 or 20000 patients who have been diagnosed as having STEMI and have received treatment for STEMI. In embodiments, the trained machine learning model has an AUC of at least 0.6, 0.7, 0.75, or 0.8 when evaluated on unseen data. In embodiments, the method further comprises, prior to providing the values of said plurality of features as inputs to said machine learning model, applying one or more steps selected from: missing values imputation, filtering, standardization and normalization.
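By way of non-limiting illustration only, two of the pre-processing steps named above (missing values imputation and standardization) and the AUC evaluation may be sketched as follows. Median imputation is an assumption chosen for the example; the disclosure does not prescribe a particular imputation strategy. The function names are hypothetical, and the AUC is computed here by direct pairwise comparison, which is adequate for small sets only.

```python
from statistics import mean, median, pstdev


def fit_preprocessor(rows):
    """Learn per-feature (median, mean, std) from training rows.

    `rows` is a list of equal-length lists; missing values are None.
    Each feature is assumed to have at least one observed value.
    """
    params = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        med = median(observed)
        filled = [r[j] if r[j] is not None else med for r in rows]
        sd = pstdev(filled) or 1.0  # avoid dividing by zero for constant features
        params.append((med, mean(filled), sd))
    return params


def transform(rows, params):
    """Impute missing values with the training median, then standardize."""
    return [
        [((v if v is not None else med) - mu) / sd
         for v, (med, mu, sd) in zip(row, params)]
        for row in rows
    ]


def auc(labels, scores):
    """Area under the ROC curve: probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

The parameters returned by `fit_preprocessor` would be learned on training patients only and reused unchanged on unseen patients, so that the AUC on unseen data is not inflated by information from the evaluation set.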
[0110] Also described herein are methods of providing trained machine learning models of the disclosure. In embodiments, a machine learning model of the disclosure may be retrained using new training data. For example, a set of predictive features of the disclosure may be used in combination with a machine learning model having any architecture of the disclosure and training data to obtain a trained machine learning model. A method of providing a tool for predicting the risk of readmission at hospital within a first predetermined period of time (e.g. 30-day) after discharge of a STEMI patient, may comprise: obtaining training data, wherein the training data comprises, for each of a plurality of patients who have been admitted to hospital and treated for STEMI (STEMI patients): (i) values of a plurality of features associated with the patient and (ii) an indication of whether the patient has been readmitted at hospital within a first predetermined period of time after discharge; training a machine learning model to predict the risk of readmission at hospital within the first predetermined period of time after discharge for a STEMI patient using said training data, wherein the machine learning model is trained to take as input the values of said plurality of features associated with a STEMI patient and to produce as output an indication of risk of readmission at hospital within the first predetermined period of time after discharge of the STEMI patient; wherein the plurality of features comprises a plurality of features selected from: patient demographics features, clinical history / comorbidities features, vital signs and / or laboratory tests features. The plurality of features may comprise any of the features described herein, and specifically any of the features described herein in the context of determining a risk of readmission at hospital within a first predetermined period of time (e.g. 30-day) after discharge of a STEMI patient.
[0111] A method of providing a tool for predicting the risk of in-hospital mortality after treatment of STEMI, may comprise: obtaining training data, wherein the training data comprises, for each of a plurality of patients who have been admitted to hospital and treated for STEMI (STEMI patients): (i) values of a plurality of features associated with the patient and (ii) a corresponding indication of whether the patient has suffered from in-hospital mortality after treatment of STEMI; training a machine learning model to predict the risk of in-hospital mortality after treatment of STEMI using said training data, wherein the machine learning model is trained to take as input the values of said plurality of features associated with a STEMI patient and to produce as output an indication of risk of in-hospital mortality after treatment of STEMI; wherein the plurality of features comprises a plurality of features selected from: patient demographics features, clinical history / comorbidities features, vital signs and / or laboratory tests features. The plurality of features may comprise any of the features described herein, and specifically any of the features described herein in the context of determining a risk of in-hospital mortality of a STEMI patient.
[0112] A method of providing a tool for predicting the risk of mortality within a second predetermined period of time (e.g. 6 months) after treatment of STEMI, may comprise: obtaining training data, wherein the training data comprises, for each of a plurality of patients who have been admitted to hospital and treated for STEMI (STEMI patients): (i) values of a plurality of features associated with the patient and (ii) a corresponding indication of whether the patient has died within the second predetermined period of time after treatment of STEMI; training a machine learning model to predict the risk of mortality within the second predetermined period of time after treatment of STEMI using said training data, wherein the machine learning model is trained to take as input the values of said plurality of features associated with a patient and to produce as output an indication of a risk of mortality within the second predetermined period of time after treatment of STEMI; wherein the plurality of features comprises a plurality of features selected from: patient demographics features, clinical history / comorbidities features, vital signs and / or laboratory tests features.
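By way of non-limiting illustration only, the common structure of the three training methods above (feature values in, binary outcome indication as label, risk indication out) may be sketched with a minimal logistic regression fitted by gradient descent. Logistic regression is used purely as a stand-in and is an assumption of this example; the passage above does not tie the methods to this model type. The function names and hyperparameters are hypothetical, and the features are assumed to have been standardized beforehand.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss.

    X: list of feature vectors (one per STEMI patient).
    y: list of 0/1 outcome indications (e.g. readmitted within 30 days).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_risk(x, w, b):
    """Return an indication of risk in (0, 1) for one patient's features."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
```

The same routine could be run three times with different label columns (in-hospital mortality, mortality within the second predetermined period, readmission within the first predetermined period) and, where appropriate, different feature subsets, yielding one model per complication.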
[0113] The methods of the present disclosure make use of a plurality of features associated with a subject which have been identified as described herein as predictive of risk of post-treatment complications in STEMI patients. Any such feature may be referred to herein as a “predictive feature”, “input variable”, or simply “feature” or “input”. A feature refers to a variable for which a value is provided as input to a machine learning model and based on which a prognosis prediction is made. Any feature may be encoded as a Boolean feature, a categorical feature or a numerical feature. Features may be selected from one or more categories of features selected from: patient demographic features, hospital admission history features, clinical history (also referred to as “comorbidities”) features, vital signs features and laboratory tests features. Patient demographic features refer to features that define a demographic characteristic of the subject. Patient demographic features may include a subject’s age, marital status, and gender. In embodiments, patient demographic features include a subject’s age. In embodiments, patient demographic features for the prediction of in-hospital mortality may include: a feature indicative of patient’s age. In embodiments, patient demographic features for the prediction of mortality within a second predetermined period of time may include: a feature indicative of patient’s age. Hospital admission history features are features that characterize the current or historical nature and / or frequency of a patient’s admission to hospital. 
For example, hospital admission history features can comprise one or more of: a feature indicative of the number of emergency department visits made by the subject in a predetermined period of time prior to assessment using a method of the disclosure, a feature indicative of the duration of the current or latest hospital admission, and a feature indicative of whether the current or latest admission is an acute emergency admission. In embodiments, hospital admission history features comprise a feature indicative of the number of emergency department visits made by the subject in a predetermined period of time prior to assessment using a method of the disclosure. In embodiments, hospital admission history features for the prediction of readmission within a first predetermined period of time may include: a feature indicative of the number of emergency department visits made by the subject in a predetermined period of time prior to assessment using a method of the disclosure.
[0114] A clinical history feature is a feature that indicates whether the subject has been diagnosed as having one or more predetermined clinical conditions (also referred to as “comorbidities”). In embodiments, a clinical history feature refers to a feature that indicates whether the subject has been diagnosed as having one or more predetermined clinical conditions (also referred to as “comorbidities”) at any time prior to an index date (e.g. date of STEMI diagnosis for mortality prediction, and date of discharge or assessment for readmission prediction). The one or more predetermined clinical conditions may be selected from: Paraplegia and Hemiplegia, Atrial Arrhythmia, CABG (Coronary Artery Bypass Grafting) or PCI (Percutaneous Coronary Intervention), Cardiogenic Shock, Cardiomyopathy Diagnosis, Coronary Artery Disease, Dyslipidemia, Hypertension, Pacemaker or Defibrillator, Peripheral Arterial Disease, Pulmonary Edema, Rales and / or Jugular Venous Distention (JVD), Stroke or TIA (Transient Ischemic Attack), Valvular Heart Disease, Ventricular Arrhythmia, Cancer, Cerebrovascular Disease, Chronic Pulmonary Disease, Congestive Heart Failure, Connective Tissue or Rheumatic Disease, Dementia, Diabetes Without Chronic Complications, Diabetes With Chronic Complications, Human Immunodeficiency Virus (HIV) or Acquired Immunodeficiency Syndrome (AIDS), Metastatic Carcinoma, Mild Liver Disease, Moderate or Severe Liver Disease, Myocardial Infarction, Peptic Ulcer Disease, Peripheral Vascular Disease, Renal Disease, Elevated Troponin, Cardiac Arrest, STEMI. In embodiments, the one or more predetermined clinical conditions are selected from: atrial arrhythmia, congestive heart failure, rales and / or Jugular venous distension (JVD), cerebrovascular disease, cardiac arrest, cardiogenic shock. 
In embodiments, clinical history features for the prediction of in-hospital mortality include one or more (or all) of: a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cardiogenic shock, a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered atrial arrhythmia, and / or a feature indicative of whether the patient has suffered cerebrovascular disease. In embodiments, clinical history features for the prediction of mortality within a second predetermined period of time may include one or more (or all) of: a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cerebrovascular disease, a feature indicative of whether the patient has suffered rales and / or jugular venous distension (JVD), a feature indicative of whether the patient has suffered atrial arrhythmia, and / or a feature indicative of whether the patient has suffered cardiogenic shock. A vital signs feature refers to a feature that characterizes a vital sign of the subject, such as e.g. a pulse, respiration, or blood pressure related measurement. A vital signs feature may be a vital signs measurement or a value derived therefrom that is indicative of the measurement. A vital signs feature may be selected from: a pulse feature (e.g. oxygen (O2) saturation), and a blood pressure feature (e.g. systolic blood pressure, diastolic blood pressure). In embodiments, a vital signs feature includes O2 saturation and / or systolic blood pressure. In embodiments, vital signs features for the prediction of in-hospital mortality include one or more (or all) of: a feature indicative of O2 saturation levels and / or a feature indicative of systolic blood pressure. 
In embodiments, vital signs features for the prediction of mortality within a second predetermined period of time may include: a feature indicative of systolic blood pressure.
[0115] A laboratory test feature is a feature that characterizes the results of one or more laboratory tests, where laboratory tests are performed on a sample previously obtained from a subject and measuring the presence or abundance of one or more predetermined markers. A laboratory test feature may be a laboratory test measurement or a value derived therefrom that is indicative of the measurement. A laboratory test feature is a feature that characterizes the results of one or more laboratory tests performed at a latest date prior to an index date, and / or within a predetermined period prior to the index date. When a plurality of test results are available (e.g. when multiple results are available associated with the same date or within the predetermined period), a summarised metric such as a median, mode or average value may be used. In embodiments, a vital signs feature is a feature that characterizes a vital sign of the subject measured at a latest date prior to an index date, and / or within a predetermined period prior to the index date. When a plurality of test results are available (e.g. when multiple results are available associated with the same date or within the predetermined period), a summarised metric such as a median, mode or average value may be used. A laboratory test and laboratory test feature may be referred to herein by the name of the predetermined marker(s) that it measures. For example, a troponin I laboratory test refers to a test that measures the abundance of troponin I in a sample. Similarly, a troponin I feature refers to a feature that characterizes the results of such a test. The predetermined markers may be selected from: cardiac markers (e.g. troponin I), renal markers (e.g. albumin, creatinine, blood urea nitrogen), electrolytes (e.g. 
sodium, potassium), metabolic markers (high density lipoprotein, low density lipoprotein, glucose, total cholesterol, triglyceride), and hematologic markers (white blood cells, platelets, hemoglobin). In embodiments, laboratory test features for the prediction of in-hospital mortality include one or more (or all) of: a feature indicative of serum creatinine level, a feature indicative of blood urea nitrogen (BUN) level, a feature indicative of white blood cell count, a feature indicative of platelet count, and / or a feature indicative of albumin levels in the blood. In embodiments, laboratory test features for the prediction of mortality within a second predetermined period of time include one or more (or all) of: a feature indicative of hemoglobin level, a feature indicative of serum creatinine level, a feature indicative of blood urea nitrogen (BUN) level, a feature indicative of white blood cell count, a feature indicative of platelet count, and / or a feature indicative of albumin levels in the blood. In embodiments, laboratory test features for the prediction of readmission within a first predetermined period of time include one or more (or all) of: a feature indicative of hemoglobin level, a feature indicative of serum creatinine level, a feature indicative of blood urea nitrogen (BUN) level, a feature indicative of sodium level, a feature indicative of HDL cholesterol level, and / or a feature indicative of albumin levels in the blood.
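By way of non-limiting illustration only, a laboratory test feature (or, equally, a vital signs feature) may be derived from time-stamped results as sketched below, using either the results from the latest date prior to the index date or all results within a predetermined period prior to the index date, and summarising multiple results with a median. The function name, the 7-day look-back and the exclusion of results dated on the index date itself are assumptions made for the example.

```python
from datetime import date, timedelta
from statistics import median


def summarised_feature(results, index_date, lookback_days=7, latest_only=True):
    """Summarise time-stamped measurements into a single feature value.

    results: iterable of (measurement_date, value) pairs.
    Returns None when no result falls inside the look-back window, leaving
    the value to be handled by a later imputation step.
    """
    start = index_date - timedelta(days=lookback_days)
    in_window = [(d, v) for d, v in results if start <= d < index_date]
    if not in_window:
        return None
    if latest_only:
        latest = max(d for d, _ in in_window)
        in_window = [(d, v) for d, v in in_window if d == latest]
    return median(v for _, v in in_window)
```

A mode or an average could be substituted for the median, as contemplated above, by replacing the final call.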
[0116] In embodiments, the predictive features used in models of the disclosure (including e.g. features relating to patient-specific data of demographics, hospital admission history / status, clinical history / comorbidities, vital signs and laboratory tests) are selected or have been selected based on their predictive power and clinical relevance. Further information on each of the above-mentioned features is provided below, and in Table 1, which provides for feature terms used, their corresponding category, examples of data types (i.e. how values for each feature can be encoded), a description, and examples of values.
[0117] Table 1. Description of predictive features used in methods of the disclosure.
[0118]
[0119] In embodiments, a feature indicative of the subject’s age is a numerical feature (e.g. an integer), e.g. 21, 58, 67, 75 or 90, representing the subject's age in years. Alternatively, the input feature for age can be a numerical feature (e.g. integer) indicating the subject’s year of birth, or the subject’s age expressed in months, weeks or days. A further alternative for a feature indicative of age is a value indicative of age based on categorical binning, e.g. subject belonging to one of a plurality of non-overlapping age ranges, such as e.g. decades. Such a feature may be expressed as a numerical value (e.g. each of one or more predetermined age ranges being associated with a different predetermined value) or a Boolean value (e.g. indicating for each of one or more predetermined age ranges, whether the subject belongs to the age range). An older age of a patient may correlate with increased health risks of the patient. A feature indicative of marital status may be a categorical feature, such as unknown, single, married. Social support levels for the patient may impact the health outcomes of the patient. A feature indicative of gender (sex) may be a categorical feature, such as male, female, other. A patient’s gender can influence risk for certain conditions.
[0120] In embodiments, a hospital admission history / status feature is an integer feature. A feature indicative of the number of emergency department (ED) visits in a predetermined period of time prior to assessment may be provided as an integer representing the number of such visits. The predetermined period of time may be selected between 3 months and 12 months. The predetermined period of time may be selected from 1 year, 11 months, 10 months, 9 months, 8 months, 7 months, 6 months, 5 months, 4 months or 3 months. The predetermined period of time may be 6 months. For example, the feature indicative of the number of emergency department (ED) visits in a predetermined period may be the number of emergency department (ED) visits in the last 6 months provided as an integer representing the number of visits in the ED in the last 6 months, e.g. 0, 1 , 2. Higher numbers of ED visits of a patient may indicate more frequent acute health issues. A feature indicative of admission duration may be provided as an integer representing the number of days since the subject has been admitted in the current or latest hospital stay, e.g. 5, 10, 15. Longer hospital stays of a patient may indicate more severe conditions. A feature indicative of whether the subject has been admitted at the current or latest admission as an acute emergency admission may be an integer or Boolean value. A Boolean value may indicate a true if the subject was admitted as an acute emergency admission, and false otherwise. An integer value may assign a first value if the subject was admitted as an acute emergency admission, and a second value otherwise. For example, the feature may equal 0 if “False” and the emergency admission is not acute, and 1 if “True” and the emergency admission is acute. 
Alternatively, for example, the feature may be assigned integer values according to the LACE index clinical score, wherein the feature may equal 0 if “False” and the emergency admission is not acute, and 3 if “True” and the emergency admission is acute. An emergency admission of a patient may indicate acute conditions.
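By way of non-limiting illustration only, the demographic and hospital admission history encodings described above may be sketched as follows. The function names are hypothetical, and approximating a month as 30 days when counting emergency department visits is an assumption of the example.

```python
from datetime import date, timedelta


def age_decade_bin(age_years):
    """Categorical binning of age into non-overlapping decades (0-9 -> 0, 10-19 -> 1, ...)."""
    return age_years // 10


def ed_visits_in_period(visit_dates, assessment_date, months=6):
    """Count emergency department visits in the period before assessment.

    Assumption: one month is approximated as 30 days.
    """
    start = assessment_date - timedelta(days=30 * months)
    return sum(1 for d in visit_dates if start <= d <= assessment_date)


def admission_duration_days(admission_date, assessment_date):
    """Number of days since admission in the current or latest hospital stay."""
    return (assessment_date - admission_date).days


def acute_admission_feature(is_acute, encoding="binary"):
    """Encode acute emergency admission as 0/1, or as 0/3 following the LACE index."""
    if encoding == "binary":
        return 1 if is_acute else 0
    if encoding == "lace":
        return 3 if is_acute else 0
    raise ValueError(f"unknown encoding: {encoding}")
```

Whichever encoding is chosen for a given feature, the same encoding would need to be applied at training time and at prediction time.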
[0121] In embodiments, a feature indicative of comorbidities / clinical history may be a Boolean, such as False or True (or corresponding numerical encoding including a first value e.g. 0 for false and a second value e.g. 1 for true), representing if a patient has been diagnosed with one or more predetermined conditions. Definitions of the prior patient clinical history may be based on the International Classification of Diseases of the medical billing codes (https://www.aapc.com/codes/code-search/). The medical billing codes are alphanumeric codes used to uniquely identify medical procedures, diagnoses, equipment, and services for uniform documentation purposes. In embodiments, a feature indicative of clinical history / comorbidities is based on the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM), which is a standardized system used to code diseases and medical conditions (morbidity) data. The definitions of the features belonging to the category of patient clinical history are further described below. In embodiments, a feature indicative of clinical history / comorbidities is an alphanumeric value indicating a disease code. In embodiments, a feature indicative of clinical history / comorbidities is a Boolean value indicating whether the subject has been identified as having a particular predetermined condition. In embodiments, a feature indicative of clinical history / comorbidities is a Boolean value indicating whether the subject has been identified as having any one of a set comprising a plurality of predetermined conditions. In embodiments, a feature indicative of clinical history / comorbidities is a feature derived from clinical history information about a subject, such as e.g. from an electronic health records database. 
In embodiments, a feature indicative of clinical history / comorbidities is a feature derived by determining whether a set of conditions that the subject has been identified as having includes one or more predetermined conditions, wherein conditions are recorded as ICD codes. The one or more predetermined conditions may be selected from: cardiac diseases or disorders (atrial arrhythmia, congestive heart failure, cardiac arrest, cardiogenic shock), vascular diseases or disorders (e.g. peripheral vascular disease, Rales and / or Jugular Venous Distention, cerebrovascular disease), renal diseases or disorders (e.g. renal disease), hypertension, liver diseases or disorders (e.g. liver disease) and metabolic diseases or disorders (e.g. dyslipidemia, diabetes, etc.). The one or more predetermined conditions may include any one or more of the following: atrial arrhythmia, congestive heart failure, Rales and / or Jugular Venous Distention (JVD), cerebrovascular disease, cardiac arrest, cardiogenic shock.
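By way of non-limiting illustration only, Boolean clinical history features may be derived from a subject's recorded ICD codes as sketched below. The mapping from conditions to ICD-10 code prefixes is an assumption made for the example and is deliberately incomplete (for instance, I48 covers atrial fibrillation and flutter only, not every atrial arrhythmia); the code sets actually used to define each condition are not given in this passage. The function names are hypothetical, and restricting codes to those recorded before the index date is left out for brevity.

```python
# Illustrative, incomplete prefix sets (assumed for this example only).
ILLUSTRATIVE_PREFIXES = {
    "congestive_heart_failure": ("I50",),
    "cardiac_arrest": ("I46",),
    "atrial_arrhythmia": ("I48",),
    "cardiogenic_shock": ("R570",),
    "cerebrovascular_disease": tuple(f"I6{i}" for i in range(10)),  # I60-I69
}


def normalise(code):
    """Upper-case an ICD code and drop the dot, e.g. 'r57.0' -> 'R570'."""
    return code.replace(".", "").strip().upper()


def comorbidity_flags(patient_codes, prefixes=ILLUSTRATIVE_PREFIXES):
    """Return one Boolean per condition: True if any recorded code matches."""
    codes = [normalise(c) for c in patient_codes]
    return {
        condition: any(code.startswith(p) for code in codes for p in condition_prefixes)
        for condition, condition_prefixes in prefixes.items()
    }
```

Each Boolean may be passed to the model directly or through the 0 / 1 numerical encoding described above.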
[0122] Whether a patient has suffered cardiac arrest may indicate a previous event where the heart stopped beating. When a patient has suffered atrial arrhythmia, it may indicate irregular heartbeat that can lead to blood clots, stroke, and heart failure. When a patient has suffered cardiogenic shock, it may indicate a condition where the heart suddenly cannot pump enough blood. When a patient has suffered cerebrovascular disease, it may indicate diseases related to blood vessels and blood supply to the brain. When a patient has suffered congestive heart failure, it may indicate a chronic condition affecting the pumping power of heart muscles. When a patient has suffered from rales and / or jugular venous distention (JVD), it may indicate heart failure or lung problems. When a patient has suffered peripheral vascular disease, it may lead to reduced blood flow. When a patient has suffered from hypertension, it may lead to heart disease and other complications. When a patient has suffered from dyslipidemia, it may lead to increased risk of heart disease. When a patient has suffered from renal disease, it may affect kidney function and overall health. If a patient has suffered from liver disease, it may affect liver function and overall health. When a patient has suffered from Paraplegia and Hemiplegia, it may indicate impairment in motor function. When a patient has undergone CABG (Coronary Artery Bypass Grafting) or PCI (Percutaneous Coronary Intervention), it may indicate past coronary intervention. When a patient has a Cardiomyopathy Diagnosis, it can affect the patient’s heart muscle function. Coronary Artery Disease is one of the leading causes of death globally. If a patient has a pacemaker or defibrillator, it may indicate management of heart rhythm issues. When a patient has suffered from Peripheral Arterial Disease, it may lead to reduced blood flow. When a patient has suffered from Pulmonary Edema, it can cause breathing difficulties.
When a patient has suffered from Stroke or TIA (Transient Ischemic Attack), it may indicate past cerebrovascular events. When a patient has suffered from Valvular Heart Disease, it can affect heart function. Ventricular Arrhythmia may be life-threatening if not managed properly. When a patient has suffered from cancer, it may indicate a group of diseases involving abnormal cell growth. When a patient has suffered from Chronic Pulmonary Disease, it can affect lung function. When a patient has suffered from Connective Tissue or Rheumatic Disease, it can affect joints and organs. When a patient suffers from Dementia, it may indicate affected cognitive function. Diabetes Without Chronic Complications may indicate affected blood sugar regulation. When a patient has suffered from Diabetes With Chronic Complications, it may indicate progression of diabetes. When a patient has suffered from Human Immunodeficiency Virus (HIV) or Acquired Immunodeficiency Syndrome (AIDS), it may indicate affected immune system function. When a patient has suffered from Metastatic Carcinoma, it may indicate an advanced stage of cancer. When a patient has suffered from Myocardial Infarction, it may indicate past heart attack. When a patient has suffered from Peptic Ulcer Disease, it can affect digestive health. When a patient has suffered from elevated Troponin, it may indicate a heart muscle injury. Thus, any of the above may be risk factors that contribute to the risk of complication after treatment for STEMI.
[0123] In embodiments, the Boolean of one or more clinical history / comorbidities features may indicate whether a patient has been diagnosed with one or more predetermined conditions within a predetermined period preceding an index date. The predetermined period may be referred to as the “lookback period”. The index date may be the date at which the subject was diagnosed with STEMI. The index date may be the latest date at which the subject was diagnosed with STEMI, e.g. when the subject has experienced multiple STEMI episodes. Such index dates may be particularly useful in the context of predicting a risk of mortality. The index date may be the date of discharge from the hospital after treatment for STEMI, or the date of assessment using methods of the disclosure, when the subject has not yet been discharged from hospital at the time of assessment using the methods of the disclosure. Such index dates may be particularly useful in the context of predicting a risk of readmission. The lookback period may be between 1 day and the entire lifetime of the patient (e.g. 123 years or 44895 days, highest recorded age), between 1 day and 90 years, between 1 day and 30 years, 1 day and 365 days, between 1 day and 180 days, between 10 days and 60 days, between 20 and 30 days. The lookback period may be the entire lifetime of a patient (in years or days). The values of one or more clinical history / comorbidities features may be extracted from a patient’s health record by searching for particular disease codes (e.g. ICD codes) and / or keywords associated with a predetermined condition. For example, the value of a clinical history feature associated with the presence of congestive heart failure may be obtained by searching a subject’s health records for the term “congestive heart failure” or for specific ICD codes that fall under the definition of congestive heart failure in one or more versions of the ICD categorization. 
Instead or in addition to this, the values of one or more clinical history / comorbidities features may be received as input from a user, or received from a user interface, database or other computing device as a previously extracted feature.
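The lookback logic described above can be sketched as follows. This is a minimal illustration and not the implementation of the disclosure: the function name, the `(code, date)` record layout, and the choice to treat the index date itself as inside the window are all assumptions made for the example.

```python
from datetime import date

def has_condition_in_lookback(diagnoses, condition_codes, index_date, lookback_days=None):
    """Return 1 if any code in condition_codes was recorded within the
    lookback period preceding index_date, else 0.

    diagnoses: iterable of (icd_code, recorded_on) pairs (hypothetical layout).
    lookback_days: window length in days; None means the entire lifetime.
    Assumption: a diagnosis recorded on the index date itself counts.
    """
    for code, recorded_on in diagnoses:
        if recorded_on > index_date:
            continue  # recorded after the index date: not part of the history
        age_days = (index_date - recorded_on).days
        if lookback_days is not None and age_days > lookback_days:
            continue  # recorded before the start of the lookback window
        if code in condition_codes:
            return 1
    return 0

# Example records: heart failure 22 days before the index date,
# cardiogenic shock several years earlier.
diagnoses = [("I50.9", date(2024, 1, 10)), ("R57.0", date(2020, 5, 1))]
index_date = date(2024, 2, 1)
chf_30d = has_condition_in_lookback(diagnoses, {"I50.9"}, index_date, 30)
shock_30d = has_condition_in_lookback(diagnoses, {"R57.0"}, index_date, 30)
shock_lifetime = has_condition_in_lookback(diagnoses, {"R57.0"}, index_date)
```

With a 30-day window only the recent heart failure code sets its feature, whereas a lifetime lookback also picks up the older cardiogenic shock record.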
[0124] The term “Congestive heart failure” as used herein refers to a condition where the heart of the subject is unable to pump blood effectively, leading to a buildup of fluid in the lungs and other tissues. The EHR-based data definition for scanning ICD codes of patient clinical history is as follows: ICD-10 Code: I50.9 (Heart failure, unspecified); Specific Types: Congestive Heart Failure, Unspecified: I50.30 (Unspecified diastolic (congestive) heart failure); Acute Congestive Heart Failure: I50.31 (Acute diastolic (congestive) heart failure); Chronic Congestive Heart Failure: I50.32 (Chronic diastolic (congestive) heart failure). Thus, a clinical history / comorbidities feature indicative of whether a patient has been diagnosed with congestive heart failure may be set to a first value when an EHR associated with the subject comprises any one or more of the above ICD codes, and / or when an EHR associated with the subject comprises the expression “congestive heart failure” (or any abbreviations thereof, e.g. CHF) or other keyword or code associated with congestive heart failure according to a predetermined dictionary, and a second value otherwise. Alternatively, a value for such a feature may simply be received, retrieved or obtained that is indicative of whether the subject has been diagnosed with congestive heart failure.
[0125] The term “Rales and / or Jugular Venous Distention (JVD)” as used herein refers to the presence of either or both of rales and JVD, suggesting potential underlying cardiovascular or respiratory pathology. The EHR-based data definition for scanning ICD codes of patient clinical history is as follows: ICD-10 Code: R01.2 (Other abnormal heart sounds); Specific Types: Rales: R09.89 (Other specified symptoms and signs involving the circulatory and respiratory systems); JVD: I87.8 (Other specified disorders of veins and lymphatic vessels); Combined Condition: R01.2 (Other abnormal heart sounds). Thus, a clinical history / comorbidities feature indicative of whether a patient has been diagnosed with rales and / or JVD may be set to a first value when an EHR associated with the subject comprises any one or more of the above ICD codes, and / or when an EHR associated with the subject comprises any one or more of the expressions “rale”, “jugular venous distension”, and abbreviated forms thereof (e.g. JVD) or other keyword or code associated with rales and / or JVD according to a predetermined dictionary, and a second value otherwise. Alternatively, a value for such a feature may simply be received, retrieved or obtained that is indicative of whether the subject has been diagnosed with rales and / or JVD.
[0126] The term “Cerebrovascular Disease” as used herein refers to a group of disorders affecting the blood vessels and blood supply to the brain, which can result in strokes, transient ischemic attacks (TIAs), and other neurological impairments. The EHR-based data definition for scanning ICD codes of patient clinical history is as follows, specifically ICD-10 Codes: G45.* (Transient cerebral ischemic attacks and related syndromes); G46.* (Vascular syndromes of brain in cerebrovascular diseases); H34.0 (Transient retinal artery occlusion); I60.* (Nontraumatic subarachnoid hemorrhage); I61.* (Nontraumatic intracerebral hemorrhage); I62.* (Other nontraumatic intracranial hemorrhage); I63.* (Cerebral infarction); I64.* (Stroke, not specified as hemorrhage or infarction); I65.* (Occlusion and stenosis of precerebral arteries, not resulting in cerebral infarction); I66.* (Occlusion and stenosis of cerebral arteries, not resulting in cerebral infarction); I67.* (Other cerebrovascular diseases); I68.* (Cerebrovascular disorders in diseases classified elsewhere); I69.* (Sequelae of cerebrovascular disease). Thus, a clinical history / comorbidities feature indicative of whether a patient has been diagnosed with cerebrovascular disease may be set to a first value when an EHR associated with the subject comprises any one or more of the above ICD codes, and / or when an EHR associated with the subject comprises any one or more of the expressions “cerebrovascular disease”, and abbreviated forms thereof (e.g. CRBVD) or other keyword or code associated with cerebrovascular disease according to a predetermined dictionary, and a second value otherwise. Alternatively, a value for such a feature may simply be received, retrieved or obtained that is indicative of whether the subject has been diagnosed with cerebrovascular disease.
[0127] The term “Cardiac Arrest” as used herein refers to a sudden stop in effective blood circulation in the subject due to the failure of the heart to contract effectively or at all. The EHR-based data definition for scanning ICD codes of patient clinical history is as follows: (i) ICD-10 Codes: I46.2 (Cardiac arrest due to underlying cardiac condition); I46.8 (Other cardiac arrest); I46.9 (Cardiac arrest, unspecified); (ii) ICD-9 Code: 427.5 (Cardiac arrest); (iii) Specific Types: Cardiac Arrest due to Underlying Cardiac Condition: I46.2; Other Cardiac Arrest: I46.8; Unspecified Cardiac Arrest: I46.9. Thus, a clinical history / comorbidities feature indicative of whether a patient has been diagnosed with cardiac arrest may be set to a first value when an EHR associated with the subject comprises any one or more of the above ICD codes, and / or when an EHR associated with the subject comprises any one or more of the expressions “cardiac arrest”, and abbreviated forms thereof (e.g. CA) or other keyword or code associated with cardiac arrest according to a predetermined dictionary, and a second value otherwise. Alternatively, a value for such a feature may simply be received, retrieved or obtained that is indicative of whether the subject has been diagnosed with cardiac arrest. The term “Cardiogenic Shock” as used herein refers to a state of inadequate tissue perfusion due to the heart's inability to pump sufficient blood in the subject, often resulting from severe heart damage. The EHR-based data definition for scanning ICD codes of patient clinical history is as follows: ICD-10 Code: R57.0 (Cardiogenic shock); Specific Types: Cardiogenic Shock: R57.0 (Cardiogenic shock).
Thus, a clinical history / comorbidities feature indicative of whether a patient has been diagnosed with cardiogenic shock may be set to a first value when an EHR associated with the subject comprises any one or more of the above ICD codes, and / or when an EHR associated with the subject comprises any one or more of the expressions “cardiogenic shock”, and abbreviated forms thereof (e.g. CS) or other keyword or code associated with cardiogenic shock according to a predetermined dictionary, and a second value otherwise. Alternatively, a value for such a feature may simply be received, retrieved or obtained that is indicative of whether the subject has been diagnosed with cardiogenic shock.
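The code-or-keyword rule used in the definitions above can be sketched as below. This is an illustration under stated assumptions, not the disclosed implementation: the dictionary layout, the function names and the reading of a trailing `.*` as "this category and all of its subcodes" are hypothetical. Plain keyword matching does not detect negation (e.g. "no cardiac arrest"), so a real dictionary-based extractor would need more care.

```python
import re

# Hypothetical "predetermined dictionary" entries built from the code lists above.
CARDIAC_ARREST = {
    "codes": ["I46.2", "I46.8", "I46.9", "427.5"],
    "keywords": ["cardiac arrest"],
}
CEREBROVASCULAR_DISEASE = {
    "codes": ["G45.*", "G46.*", "H34.0", "I60.*", "I61.*", "I62.*", "I63.*",
              "I64.*", "I65.*", "I66.*", "I67.*", "I68.*", "I69.*"],
    "keywords": ["cerebrovascular disease"],
}

def code_matches(code, pattern):
    """True if an ICD code matches a pattern; 'I63.*' is read as I63 and any subcode."""
    code, pattern = code.strip().upper(), pattern.strip().upper()
    if pattern.endswith(".*"):
        stem = pattern[:-2]
        return code == stem or code.startswith(stem + ".")
    return code == pattern

def condition_feature(ehr_codes, ehr_notes, definition):
    """Return 1 (first value) on any code or keyword hit, else 0 (second value)."""
    code_hit = any(code_matches(c, p) for c in ehr_codes for p in definition["codes"])
    keyword_hit = any(
        re.search(r"\b" + re.escape(k) + r"\b", note, re.IGNORECASE)
        for note in ehr_notes
        for k in definition["keywords"]
    )
    return int(code_hit or keyword_hit)

stroke_flag = condition_feature(["I63.9", "I10"], [], CEREBROVASCULAR_DISEASE)
arrest_flag = condition_feature([], ["History of cardiac arrest in 2019."], CARDIAC_ARREST)
no_flag = condition_feature(["I10"], ["No relevant history."], CARDIAC_ARREST)
```

Keeping codes and keywords in one definition per condition means that a new comorbidity feature only requires a new dictionary entry and no new matching logic.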
[0128] The term “Atrial Arrhythmia” as used herein refers to a condition characterized by an irregular or abnormal heart rhythm originating in the atria, the upper chambers of the heart. This condition can lead to various complications such as stroke and heart failure if not managed properly. The EHR-based data definition for scanning ICD codes of patient clinical history is as follows: ICD-10 Code for Atrial Arrhythmia: I48: Atrial fibrillation and flutter; I48.*: Specific types of atrial fibrillation and flutter (e.g., persistent, permanent, paroxysmal); I49.9*: Unspecified cardiac arrhythmia. Thus, a clinical history / comorbidities feature indicative of whether a patient has been diagnosed with atrial arrhythmia may be set to a first value when an EHR associated with the subject comprises any one or more of the above ICD codes, and / or when an EHR associated with the subject comprises any one or more of the expressions “atrial arrhythmia”, “atrial fibrillation”, “cardiac arrhythmia”, and abbreviated forms thereof (e.g. AA) or other keyword or code associated with atrial arrhythmia according to a predetermined dictionary, and a second value otherwise. Alternatively, a value for such a feature may simply be received, retrieved or obtained that is indicative of whether the subject has been diagnosed with atrial arrhythmia.
[0129] The term “Peripheral Vascular Disease” as used herein refers to a circulatory condition characterized by a subject presenting narrowed blood vessels and / or reduced blood flow to the limbs. The EHR-based data definition for scanning ICD codes of patient clinical history is as follows, specifically ICD-10 codes: I70.* (Atherosclerosis); I71.* (Aortic aneurysm and dissection); I73.1 (Thromboangiitis obliterans [Buerger's disease]); I73.8 (Other specified peripheral vascular diseases); I73.9 (Peripheral vascular disease, unspecified); I77.1 (Stricture of artery); I79.0 (Aneurysm of artery in diseases classified elsewhere); I79.2 (Peripheral angiopathy in diseases classified elsewhere); K55.1 (Chronic vascular disorders of intestine); K55.8 (Other vascular disorders of intestine); K55.9 (Vascular disorder of intestine, unspecified); Z95.8 (Presence of other vascular implants and grafts); Z95.9 (Presence of unspecified vascular implant and graft). Thus, a clinical history / comorbidities feature indicative of whether a patient has been diagnosed with peripheral vascular disease may be set to a first value when an EHR associated with the subject comprises any one or more of the above ICD codes, and / or when an EHR associated with the subject comprises any one or more of the expressions “peripheral vascular disease”, and abbreviated forms thereof (e.g. PVD) or other keyword or code associated with peripheral vascular disease according to a predetermined dictionary, and a second value otherwise. Alternatively, a value for such a feature may simply be received, retrieved or obtained that is indicative of whether the subject has been diagnosed with peripheral vascular disease.
[0130] The term “Hypertension” as used herein refers to a condition where the force of the blood against the artery walls is too high, which can lead to heart disease, stroke, and other health problems. The EHR-based data definition for scanning ICD codes of patient clinical history is as follows: I10.* (Primary); Specific Types: Hypertensive Heart Disease: I11.9 (Hypertensive heart disease without heart failure); Hypertensive Chronic Kidney Disease: I12.9 (Hypertensive chronic kidney disease with stage 1 through stage 4 chronic kidney disease, or unspecified chronic kidney disease). Thus, a clinical history / comorbidities feature indicative of whether a patient has been diagnosed with hypertension may be set to a first value when an EHR associated with the subject comprises any one or more of the above ICD codes, and / or when an EHR associated with the subject comprises any one or more of the expressions “hypertension”, “hypertensive heart disease”, “hypertensive chronic kidney disease”, “hypertensive”, and abbreviated forms thereof (e.g. HTN) or other keyword or code associated with hypertension according to a predetermined dictionary, and a second value otherwise. Alternatively, a value for such a feature may simply be received, retrieved or obtained that is indicative of whether the subject has been diagnosed with hypertension.
[0131] The term “Renal Disease” as used herein refers to a condition of impaired kidney function, leading to the accumulation of waste products and fluid imbalances in the body. The EHR-based data definition for scanning ICD codes of patient clinical history is as follows: (i) ICD-10 Code: N18.9 (Chronic kidney disease, unspecified); (ii) Specific Types: Chronic Kidney Disease, Stage 1: N18.1; Chronic Kidney Disease, Stage 2 (Mild): N18.2; Chronic Kidney Disease, Stage 3 (Moderate): N18.3; Chronic Kidney Disease, Stage 4 (Severe): N18.4; Chronic Kidney Disease, Stage 5: N18.5; End-Stage Renal Disease: N18.6. Thus, a clinical history / comorbidities feature indicative of whether a patient has been diagnosed with renal disease may be set to a first value when an EHR associated with the subject comprises any one or more of the above ICD codes, and / or when an EHR associated with the subject comprises any one or more of the expressions “renal disease”, “kidney disease”, and abbreviated forms thereof (e.g. CKD) or other keyword or code associated with renal disease according to a predetermined dictionary, and a second value otherwise. Alternatively, a value for such a feature may simply be received, retrieved or obtained that is indicative of whether the subject has been diagnosed with renal disease.
[0132] The term “Liver Disease” as used herein refers to a spectrum of liver conditions that are generally less severe, including fatty liver disease, mild hepatitis, and early-stage liver fibrosis or cirrhosis. The EHR-based data definition for scanning ICD codes of patient clinical history for specific types is as follows: (i) Chronic Viral Hepatitis: B18; (ii) Alcoholic Liver Disease: Alcoholic Fatty Liver: K70.0; Alcoholic Hepatitis: K70.1; Alcoholic Fibrosis and Sclerosis of Liver: K70.2; Alcoholic Cirrhosis of Liver: K70.3; Alcoholic Liver Disease, Unspecified: K70.9; (iii) Toxic Liver Disease: Chronic Persistent Hepatitis: K71.3; Chronic Lobular Hepatitis: K71.4; Chronic Active Hepatitis: K71.5; Fibrosis and Cirrhosis of Liver: K71.7; Chronic Hepatitis: K73; (iv) Fibrosis and Cirrhosis of Liver: K74; (v) Other Liver Conditions: Fatty Liver (not elsewhere classified): K76.0; Central Hemorrhagic Necrosis of Liver: K76.2; Infarction of Liver: K76.3; Peliosis Hepatis: K76.4; Other Specified Diseases of Liver: K76.8; Liver Disease, Unspecified: K76.9; Liver Transplant Status: Z94.4. Thus, a clinical history / comorbidities feature indicative of whether a patient has been diagnosed with liver disease may be set to a first value when an EHR associated with the subject comprises any one or more of the above ICD codes, and / or when an EHR associated with the subject comprises any one or more of the expressions “liver disease”, “fatty liver”, “liver fibrosis”, “fibrosis of the liver”, “hepatitis”, “liver transplant”, and abbreviated forms thereof (e.g. LD) or other keyword or code associated with liver disease according to a predetermined dictionary, and a second value otherwise. Alternatively, a value for such a feature may simply be received, retrieved or obtained that is indicative of whether the subject has been diagnosed with liver disease.
[0133] In embodiments, an input variable that is indicative of vital signs is a numerical feature (also referred to as float) or an integer. For example, a feature indicative of O2 saturation may be a numeric value (float) representing the percentage of hemoglobin that is saturated with oxygen in the blood of the subject, e.g. 61.5, 94.8, 98 or 100 [%]. Alternatively, the feature indicative of O2 saturation can be provided as an indication of one range among a plurality of non-overlapping ranges (e.g. categorical binning) of % O2 saturation in the blood of the subject. The ranges may be defined by reference to expected values of O2 saturation in a healthy subject and / or in a subject with hypoxemia. The feature indicative of O2 saturation may be a categorical variable indicating whether the subject’s O2 saturation value is within a first range associated with a healthy subject, or in another range, such as e.g. a second range lower than the first range and associated with subjects having hypoxemia. Thus, the feature may be a categorical variable that indicates whether the subject’s O2 saturation value belongs to a certain category, e.g. normal, subject having hypoxemia, or others. Low levels of O2 saturation may indicate respiratory or cardiac issues. A feature indicative of systolic or diastolic blood pressure may be a numerical feature, e.g. a float or an integer. As an example, a feature indicative of systolic blood pressure (SBP) may be a float representing the continuous numerical value of the actual systolic BP measurement in mmHg, e.g. 54, 126, 143 or 233 [mmHg]. As another example, a feature indicative of diastolic blood pressure (DBP) may be an integer representing the continuous numerical value of the actual diastolic BP measurement in mmHg, e.g. 80 [mmHg].
Alternatively, a feature indicative of systolic and / or diastolic BP can be provided as a categorical feature indicative of whether the subject’s systolic / diastolic blood pressure value belongs to one of a predetermined set of non-overlapping ranges, such as e.g. ranges associated with low, normal, prehypertension, hypertension, etc. For example, a feature indicative of systolic and / or diastolic BP can be provided as a categorical feature indicative of whether the subject’s systolic / diastolic blood pressure value belongs to a first range associated with normal blood pressure, or any one or more other ranges, such as a second range associated with low blood pressure (hypotension), and a third range associated with high blood pressure (hypertension). High values of systolic blood pressure may indicate hypertension. High values of diastolic blood pressure may indicate hypertension.
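The categorical binning of vital signs described above can be sketched as follows. The text does not specify the range boundaries, so the cut points in this example are purely hypothetical placeholders chosen for illustration; they are not taken from the disclosure or from clinical guidance. The helper name `bin_value` is likewise an assumption.

```python
from bisect import bisect_right

def bin_value(value, edges, labels):
    """Map a measurement to one of len(edges) + 1 non-overlapping ranges.

    edges must be sorted ascending. A value equal to an edge falls into the
    range above that edge, i.e. each range is [lower, upper).
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    return labels[bisect_right(edges, value)]

# Hypothetical cut points, for illustration only.
SBP_EDGES = [90, 120, 140]                # mmHg
SBP_LABELS = ["low", "normal", "prehypertension", "hypertension"]
SPO2_EDGES = [95]                         # percent
SPO2_LABELS = ["hypoxemia", "normal"]

sbp_categories = [bin_value(v, SBP_EDGES, SBP_LABELS) for v in (54, 126, 143, 233)]
spo2_categories = [bin_value(v, SPO2_EDGES, SPO2_LABELS) for v in (61.5, 94.8, 98, 100)]
```

Using half-open ranges guarantees that every measurement falls into exactly one category, which is what makes the ranges non-overlapping.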
[0134] In embodiments, a laboratory test feature is a numerical feature (also referred to as float), representing a measured amount of a biomarker, expressed as a concentration value or count of the biomarker per volume of sample. The concentration or count of the biomarker in a sample may be measured with prescribed methods, which are known to a skilled person in the field. The sample may be a blood sample such as a whole blood, serum or plasma sample. The sample may be a serum sample. Unless context indicates otherwise (such as e.g. where cell counts are measured, which may be measured in whole blood or in a processed blood sample from which plasma has been separated out), a laboratory test feature may be a feature measured in a serum sample. For example, a feature indicative of blood urea nitrogen (BUN) may be a numeric value (float) representing the urea nitrogen levels in the blood of the patient, e.g. 4, 20, 30, 134 [mg / dL]. High levels of BUN can indicate kidney dysfunction. As an example, a feature indicative of hemoglobin may be a numeric value (float) representing the hemoglobin levels in the blood of the patient, e.g. 4.1, 10.45, 12.5, 14.3, 19.6 [g / dL]. Hemoglobin is critical for oxygen transport; low levels of hemoglobin may indicate anemia. As an example, a feature indicative of HDL (high-density lipoprotein) may be a numeric value (float) representing the HDL cholesterol level of a patient, e.g. 39, 48, 32 [mg / dL]. Higher HDL levels may generally be better, and may indicate lower risk of heart disease. Conversely, lower HDL levels, together with high total cholesterol and triglyceride levels, may indicate a higher risk of heart disease. As an example, a feature indicative of platelet may be a numeric value (float) representing the platelet count, e.g. 13, 221.5, 277, 887 [x10^3 / uL]. A feature indicative of platelet may be measured in a whole blood sample or in a processed blood sample, such as e.g. a buffy coat sample.
Low levels of platelet count may lead to bleeding issues; high levels of platelet count may suggest clotting risks. As an example, a feature indicative of serum albumin may be a numeric value (float) representing the serum albumin levels in the blood, e.g. 1.2, 3.6, 4.0, 5.5 [g / dL]. As an example, a feature indicative of serum creatinine may be a numeric value (float) representing the creatinine levels in blood, e.g. 0.4, 1.1, 1.5, 9.9 [mg / dL]. Elevated levels of a patient’s serum creatinine may indicate kidney dysfunction. As an example, a feature indicative of sodium may be a numeric value (float) representing the sodium level, e.g. 138, 140, 136 [mmol / L]. Abnormal levels of sodium may indicate electrolyte imbalances. As an example, a feature indicative of White Blood Cell (WBC) may be a numeric value (float) representing the white blood cell count, e.g. 0.14, 9.62, 13.1, 97.72 [x10^3 / uL]. A feature indicative of white blood cell may be measured in a whole blood sample or in a processed blood sample, such as e.g. a buffy coat sample. High levels of WBC may be indicative of infection or inflammation. As an example, a feature indicative of Troponin I may be a numeric value (float) representing the Troponin I concentration, e.g. 0.01, 0.03 [ng / mL] in a serum sample. Elevated levels of Troponin I concentration may indicate heart muscle injury. As an example, a feature indicative of glucose may be a numeric value (float) representing the glucose level, e.g. 90, 110 [mg / dL] in a serum sample. Glucose levels may be important for diagnosing and managing diabetes. As an example, a feature indicative of potassium may be a numeric value (float) representing the potassium level, e.g. 3.5, 4.0, 5.0 [mEq / L] in a serum sample. Potassium level may be critical for nerve and muscle function.
[0135] In embodiments, a laboratory test feature can be associated with Logical Observation Identifiers Names and Codes (LOINC). LOINC is a universal standard for unique identifiers of medical laboratory and clinical observations. Each LOINC code is unique and corresponds to a specific medical test, measurement, or observation. Thus, in embodiments, values of laboratory test features may be obtained by extracting data (e.g. from an electronic health record database) associated with one or more predetermined LOINC codes. Using LOINC codes to extract data for laboratory test features may ensure consistency and interoperability across different healthcare systems and laboratories.
[0136] In embodiments, the values of one or more laboratory tests and / or vital signs features may be associated with measurements that have been acquired within a predetermined period preceding an index date. The predetermined period may be referred to as the “lookback period”. In embodiments, the values of one or more laboratory tests and / or vital signs features may be associated with the latest measurement indicative of the feature within the lookback period. The index date may be the date at which the subject was diagnosed with STEMI. The index date may be the latest date at which the subject was diagnosed with STEMI, e.g. when the subject has experienced multiple STEMI episodes. Such index dates may be particularly useful in the context of predicting a risk of mortality. The index date may be the date of discharge from the hospital after treatment for STEMI, or the date of assessment using methods of the disclosure, when the subject has not yet been discharged from hospital at the time of assessment using the methods of the disclosure. Such index dates may be particularly useful in the context of predicting a risk of readmission. The lookback period may be between 1 day and 365 days, between 1 day and 180 days, between 10 days and 60 days, between 10 days and 40 days, between 20 and 30 days, or about 30 days. The lookback period may be 30 days.
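Selecting the latest laboratory measurement within the lookback period, keyed by LOINC code, can be sketched as follows. This is an illustrative sketch only: the `(loinc, date, value)` record layout and the function name are assumptions, and the two LOINC codes shown are examples that should be checked against the LOINC database, since the text does not state which codes are used.

```python
from datetime import date

# Illustrative feature-to-LOINC mapping (assumed for this example; verify the
# codes against the LOINC database before relying on them).
LOINC_BY_FEATURE = {
    "serum_creatinine": {"2160-0"},
    "hemoglobin": {"718-7"},
}

def latest_lab_value(observations, feature, index_date, lookback_days=30):
    """Return the most recent value for a feature within the lookback period,
    or None if no matching measurement exists.

    observations: iterable of (loinc_code, measured_on, value) tuples
    (hypothetical layout). Measurements after the index date are ignored.
    """
    codes = LOINC_BY_FEATURE[feature]
    candidates = [
        (measured_on, value)
        for loinc, measured_on, value in observations
        if loinc in codes and 0 <= (index_date - measured_on).days <= lookback_days
    ]
    if not candidates:
        return None
    # The latest measurement wins when several fall inside the window.
    return max(candidates, key=lambda item: item[0])[1]

observations = [
    ("2160-0", date(2024, 1, 5), 1.1),
    ("2160-0", date(2024, 1, 20), 1.5),
    ("2160-0", date(2023, 6, 1), 0.9),
    ("718-7", date(2024, 1, 18), 12.5),
]
creatinine = latest_lab_value(observations, "serum_creatinine", date(2024, 1, 25))
hemoglobin = latest_lab_value(observations, "hemoglobin", date(2024, 1, 25))
stale = latest_lab_value(observations, "hemoglobin", date(2024, 3, 25))
```

Returning `None` when no measurement falls inside the window leaves the handling of missing values to a later step instead of silently substituting a stale result.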
[0137] A subject who has been identified as having a high risk of complication (poor prognosis) may be selected for treatment with a first treatment plan. The first treatment plan may include treatment with a first therapy or combination of therapies. The first treatment plan may include a first monitoring frequency. A subject who has been identified as having a low risk of complication (good prognosis) may be selected for treatment with a second treatment plan. The second treatment plan may include a second therapy or combination of therapies. The second treatment plan may include a second monitoring frequency. The second monitoring frequency may be lower than the first monitoring frequency. The first and / or second therapies may be therapies that are currently used in the clinic but indicated for different types of subjects or diseases, or experimental therapies. The second treatment plan may include a discharge from the hospital. The first treatment plan may include a delay of discharge from the hospital. The delay of discharge from the hospital may be for a predetermined period. The subject may be reassessed after the predetermined period. The delay of discharge from the hospital may be until further testing has been performed. The first and / or second treatment plan may be chosen according to clinical guidelines, such as ACCF / AHA guidelines for release and management of patients with STEMI (www.ahajournals.org/doi/10.1161/cir.0b013e3182742cf6; O’Gara et al. 2012), European Society of Cardiology (ESC) 2017 Guidelines (academic.oup.com/eurheartj/article/39/2/119/4095042?login=false; Ibanez et al. 2018) or ESC 2023 Guidelines for management of acute coronary syndrome (ACS) (academic.oup.com/ehjacc/article/13/1/55/7280662; Byrne et al. 2024), all of which are incorporated herein by reference.
For example, a subject who has been identified as having a high risk of complication (poor prognosis) may be selected for treatment with a first treatment plan indicated for subjects at moderate / high risk according to the ACCF / AHA guidelines for release and management of patients with STEMI. For example, a subject who has been identified as having a low risk of complication (good prognosis) may be selected for treatment with a second treatment plan comprising an early discharge and / or transfer to a different hospital (e.g. a hospital more local to the subject). Early discharge or transfer may refer to a transfer or discharge within 24 h, 48 h or 72 h of intervention. As another example, a subject who has been identified as having a high risk of complication (poor prognosis) may undergo secondary prevention interventions, including the use of cardiac rehabilitation, aspirin, lipid-lowering therapy, beta blockers, and ACE (Angiotensin-Converting Enzyme) inhibitors. A subject who has been identified as having a high risk of complication (poor prognosis) may be selected for a coronary angiography prior to discharge. Such a patient may be selected for further intervention based on the results of the coronary angiography. A subject who has been identified as having a high risk of complication (poor prognosis) may be selected for inclusion in a cardiac rehabilitation and / or secondary prevention program. Monitoring may comprise visits to a healthcare provider. A subject who has been identified as having a high risk of complication (poor prognosis) may be selected for prolonged ECG monitoring compared to a subject who has been identified as having a low risk of complication. For example, the former may be selected for ECG monitoring for more than 24 hours (e.g. 48 hours or more) after intervention. The latter may be selected for ECG monitoring for 24 hours or less, or less than 48 hours after intervention.
[0138] The systems and methods described herein may be implemented in a computer system, in addition to the structural components and user interactions described. As used herein, the term “computer system” includes the hardware, software and data storage devices for embodying a system or carrying out a method according to any of the described embodiments. For example, a computer system may comprise a central processing unit (CPU), input means, output means and data storage, which may be embodied as one or more connected computing devices. Preferably the computer system has a display or comprises a computing device that has a display to provide a visual output display. The data storage may comprise RAM, disk drives or other computer readable media. The computer system may include a plurality of computing devices connected by a network and able to communicate with each other over that network. It is explicitly envisaged that a computer system may consist of or comprise a cloud computer. The methods described herein may be provided as computer programs or as computer program products or computer readable media carrying a computer program which is arranged, when run on a computer, to perform the method(s) described herein. As used herein, the term “computer readable media” includes, without limitation, any non-transitory medium or media which can be read and accessed directly by a computer or computer system. The media can include, but are not limited to, magnetic storage media such as floppy discs, hard disc storage media and magnetic tape; optical storage media such as optical discs or CD-ROMs; electrical storage media such as memory, including RAM, ROM and flash memory; and hybrids and combinations of the above such as magnetic / optical storage media.
[0139] In embodiments, a first set of features used to predict the risk of in-hospital mortality comprises any one or more or all of the features in Table 22. In embodiments, a first set of features used to predict the risk of in-hospital mortality comprises: one or more demographic features comprising a feature indicative of the patient’s age; a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, and a feature indicative of whether the patient has suffered cardiogenic shock, a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cerebrovascular disease; a plurality of laboratory test features comprising: a feature indicative of the patient’s white blood cell count, and a feature indicative of the patient’s serum albumin level. In embodiments, a first set of features used to predict the risk of in-hospital mortality comprises any one or more or all of the features in Table 7.
[0140] In embodiments, a second set of features used to predict the risk of mortality within a predetermined period of time after treatment of STEMI (e.g. 6-month mortality) comprises any one or more or all of the features in Table 23. In embodiments, a second set of features used to predict the risk of mortality within a predetermined period of time after treatment of STEMI (e.g. 6-month mortality) comprises: one or more demographic features comprising a feature indicative of the patient’s age; a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cardiogenic shock, and a feature indicative of whether the patient has suffered cerebrovascular disease; and a plurality of laboratory test features comprising a feature indicative of the patient’s serum albumin level, a feature indicative of the patient’s white blood cell count, and a feature indicative of the patient’s blood urea nitrogen (BUN) level. In embodiments, a second set of features used to predict the risk of mortality within a predetermined period of time after treatment of STEMI (e.g. 6-month mortality) comprises any one or more or all of the features in Table 8.
[0141] In embodiments, a third set of features used to predict the risk of readmission at hospital within a predetermined period of time after discharge (e.g. 30-day readmission) comprises any one or more or all of the features in Table 24. In embodiments, a third set of features used to predict the risk of readmission at hospital within a predetermined period of time after discharge (e.g. 30-day readmission) comprises: a plurality of hospital admission history features comprising a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting, and a feature indicative of the patient’s admission duration; a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered renal disease, a feature indicative of whether the patient has suffered chronic pulmonary disease; a plurality of laboratory test features comprising a feature indicative of the patient’s hemoglobin level, and a feature indicative of the patient’s blood urea nitrogen (BUN) level. In embodiments, a third set of features used to predict the risk of readmission at hospital within a predetermined period of time after discharge (e.g. 30-day readmission) comprises any one or more or all of the features in Table 9.
[0142] Figure 2 is a flow diagram showing, in schematic form, a method of providing a prognosis or treatment recommendation for a subject according to the disclosure. At optional step 10, demographic, clinical and / or laboratory data is obtained about a subject, such as e.g. from an electronic health records database, a memory, user interface or other computing device, as described herein. At step 14, it is determined whether the subject has a poor or intermediate / good prognosis (high vs medium / low risk of complication) and / or whether the subject has a good or intermediate / poor prognosis (low vs medium / high risk of complication), using methods described herein. This may comprise a step 14A of determining a probabilistic score (e.g. a probability) of the subject experiencing a complication using a machine learning model that has been trained to take as input values of predictive features as described herein and provide as output such a probability, using training data comprising values of said predictive features for a plurality of STEMI patients comprising a plurality of STEMI patients that experienced the complication and a plurality of STEMI patients that did not experience the complication. At optional step 14B, the patient may be classified between a plurality of classes by comparing the score obtained at step 14A to one or more predetermined thresholds, as described herein. At step 18, one or more results of this analysis may optionally be provided to a user through a user interface. At optional step 20, a particular course of treatment (which may comprise one or more different individual therapies and / or monitoring schedules) may be identified based on the results of step 14, as described herein. At optional step 22, the subject may be treated with the therapy identified at step 20.
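Optional step 14B can be illustrated with a minimal sketch, assuming a score in [0, 1] from step 14A. The function name `classify_risk`, the threshold values 0.2 and 0.5, and the class labels are hypothetical placeholders chosen for illustration; the disclosure does not specify them here.

```python
from bisect import bisect_right


def classify_risk(score, thresholds=(0.2, 0.5), labels=("low", "medium", "high")):
    """Map a probabilistic score to a risk class using ascending thresholds.

    The thresholds and labels are illustrative assumptions, not values
    taken from the disclosure.
    """
    if len(labels) != len(thresholds) + 1:
        raise ValueError("need exactly one more label than thresholds")
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in ascending order")
    # bisect_right counts how many thresholds are <= score, so a score equal
    # to a threshold falls into the higher-risk class.
    return labels[bisect_right(thresholds, score)]


if __name__ == "__main__":
    for s in (0.05, 0.2, 0.75):
        print(s, classify_risk(s))
```

Assigning a score that equals a threshold to the higher-risk class is a conservative design choice made for this sketch only.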
[0143] Figure 3 shows an embodiment of a system for providing a prognosis or treatment recommendation, according to the present disclosure. The system comprises a computing device 1, which comprises a processor 101 and computer readable memory 102. The computing device 1 may be a computing device associated with a healthcare practitioner (not shown). In the embodiment shown, the computing device 1 also comprises a user interface 103, which is illustrated as a screen but may include any other means of conveying information to a user such as e.g. through audible or visual signals. The computing device 1 is communicably connected, such as e.g. through a network, to one or more databases 2 storing demographic data and / or vitals data and / or laboratory test data and / or clinical data. For example, the computing device 1 may be communicably connected to one or more electronic health records databases 2. The one or more databases 2 may further store one or more of: parameters (such as e.g. thresholds or parameters of a machine learning algorithm trained to predict prognosis), patient and / or sample related information, etc. The computing device may be a smartphone, tablet, personal computer or other computing device. The computing device is configured to implement a method as described herein. In alternative embodiments, the computing device 1 is configured to communicate with a remote computing device (not shown, such as e.g. a server), which is itself configured to implement a method as described herein and provide one or more results thereof to computing device 1. In such cases, the remote computing device may also be configured to send the result of the method to the computing device. Communication between the computing device 1 and the remote computing device may be through a wired or wireless connection, and may occur over a local or public network 6 such as e.g. over the public internet.
The database 2 may be in wired connection with the computing device 1, or may be able to communicate through a wireless connection, such as e.g. through WiFi and / or over the public internet, as illustrated. The connection between the computing device 1 and the database 2 may be direct or indirect (such as e.g. through a remote computer).
[0144] Any one or more of the prognostic prediction methods described herein may be implemented as part of an algorithm hosting platform, designed for seamless integration with Electronic Medical Records (EMR, also referred to herein as Electronic Health Record, EHR) systems, such as a system as illustrated in Fig. 1A. The system may be configured to request patient data from an EMR and process it as described herein to generate valuable clinical insights (i.e. prognostic predictions). The system may comprise an algorithm hosting platform, comprising one or more computer processors and one or more non-transitory memories storing instructions for implementing methods as described herein. These may include instructions to implement each of the methods of predicting a post-treatment complication selected from in-hospital mortality, mortality within a predetermined period of time, and re-admission within a predetermined period of time. Each such set of instructions may be referred to as an “algorithm”. Each algorithm within the algorithm risk system can be utilized independently, offering flexibility that allows clinicians to use it according to their specific needs and clinical practices. The algorithm hosting platform may further be configured to receive information (e.g. in the form of a user request) through a user interface (optionally via an API) indicating which of the algorithms to execute. The algorithm hosting platform may further be configured to output a result of the algorithm(s) executed, through a user interface (optionally via an API). For example, one or more insights output by the algorithm(s) may be displayed directly on a clinician's computer, providing an intuitive and comprehensive view of the patient's health status. By utilizing this solution, clinicians can make more informed decisions, resulting in improved patient outcomes and streamlined clinical workflows.
Overall, the integration of this advanced algorithm system with EMR systems not only enhances the decision-making process for clinicians but also contributes to more efficient and effective patient care. Fig. 1B shows an example scenario in which methods and systems of the disclosure may be used. A patient may be admitted to hospital with one or more symptoms of STEMI. The patient may be diagnosed as having STEMI, e.g. via an ECG and / or troponin test. The patient may then receive treatment for STEMI, e.g. percutaneous intervention. After treatment, the risk of a first post-treatment complication, e.g. in-hospital mortality, may be predicted. The risk of a second post-treatment complication, e.g. 6-month mortality, may also be predicted. The first and second predictions may be repeated at one or more times during hospitalization of the patient. The patient may then be discharged and the risk of a third post-treatment complication, e.g. 30-day readmission, may be predicted. The first and second predictions may be used to determine whether the patient can be discharged. The third prediction may be used to determine how the patient should be monitored following discharge. The process may be repeated every time the subject is re-admitted with symptoms of STEMI.
[0145] The following is presented by way of example and is not to be construed as a limitation to the scope of the claims.
[0146] Examples
[0147] The invention will now be described by exemplary methods. In particular, the examples below demonstrate a method of training a machine learning model and determining relevant features for risk stratification of STEMI patients for post-treatment complications (Fig. 4). Furthermore, the examples below demonstrate the performance of methods for risk stratification of STEMI patients for post-treatment complications, specifically for predicting the risk of in-hospital mortality after treatment of STEMI, the risk of 6-month mortality after treatment of STEMI, and the risk of 30-day readmission at hospital after discharge of the patient.
[0148] Example 1
[0149] The present example describes the training and evaluation of machine learning models for predicting risk of post-treatment complications using data from electronic health records in a cohort of heart failure patients.
[0150] 1.1. Materials and methods
[0151] 1.1.1. Data and features
[0152] Training data. For the development of these ML algorithms, real-world EHR data from heart failure patients were obtained from TrinetX (trinetx.com), which provided de-identified data from networks and healthcare organizations (HCO) and other data providers located in North America. Specifically, the data consisted of patients with known incidents of heart failure diagnoses between 2016 and 2019. The dataset comprised a total population of N = 400003 subjects, with N = 23629 subjects diagnosed with STEMI, of whom N = 10753 subjects had an inpatient diagnosis (STEMI diagnosis was made in the hospital). Only the N = 23629 subjects diagnosed with STEMI were used for ML model training purposes. The prevalences of the three post-treatment complication endpoints in the data set are: 12% for in-hospital mortality, 23% for 6-month mortality, and 28% for 30-day readmission. The laboratory test features were represented using LOINC in the TrinetX dataset, therefore LOINC codes were used to extract these values. The composition of the cohort of the training data obtained from the TrinetX dataset, specifically the balanced patient gender distribution, a wide patient age range, varied regional locations, and the patient ethnic distribution, is shown in Fig. 5. Fig. 5A shows the cohort composition for the in-hospital mortality outcome, Fig. 5B for the 6-month mortality outcome and Fig. 5C for the 30-day readmission outcome.
[0153] Label definition. In order to extract relevant features from the electronic health records (EHR) data source the inventors first defined index dates for each outcome (Fig. 6). An index date is a reference time point when the prediction takes place and defines the start of a prediction period and the start of an EHR lookback period. For prediction of both the in-hospital and the 6-month mortality outcome the index date was chosen to be the date of STEMI diagnosis and treatment. For prediction of the 30-day readmission outcome the index date was chosen to be the date of the patient discharge. The 30-day readmission risk refers to the risk that the patient is discharged from hospital admission when STEMI was diagnosed and re-admitted due to all-cause readmission. Additionally, the inventors accounted for the possibility that a patient could be admitted to the hospital multiple times, and could be diagnosed with a STEMI multiple times. Each admission is treated as a separate instance / admission. Based on these index definitions, the inventors then defined the EHR lookback period, which is a 30-day lookback period for the laboratory test and the vital sign features relative to the index date. If there are multiple values of the features indicative of laboratory tests and / or vital signs within the lookback period, the values which are associated with the latest measurement of the feature within the lookback period are selected. The hospital admission history / status features were chosen from the time period leading up to the index date. The lookback period for a feature indicative of the number of emergency department (ED) visits was 6 months relative to the index date. Other hospital admission history / status features, such as a feature indicative of admission duration representing the number of days since the subject has been admitted in the current hospital stay, were chosen from the day of admission of the current hospital stay until the index date. 
No time restrictions were applied for features associated with comorbidities and demographics. All data entries were associated with a date (i.e. precision of a day), with no actual time of measurement. If a laboratory test and / or vital sign feature has been measured multiple times during a given day, the median of the multiple measurements / entries of the feature throughout the day is calculated and used as the input variable.
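The lookback rules for laboratory test and vital sign features can be sketched as follows. This is a minimal illustration, assuming measurements arrive as (date, value) pairs and that the lookback window includes the index date itself; the function name `latest_value_in_lookback` and the example values are hypothetical.

```python
from datetime import date, timedelta
from statistics import median


def latest_value_in_lookback(measurements, index_date, lookback_days=30):
    """Return the feature value for one patient and one feature.

    measurements: iterable of (date, value) pairs with day-level precision.
    Only days inside [index_date - lookback_days, index_date] are considered
    (including the index date is an assumption of this sketch). Values
    recorded on the same day are collapsed to their median, and the most
    recent day inside the window is used.
    """
    start = index_date - timedelta(days=lookback_days)
    by_day = {}
    for day, value in measurements:
        if start <= day <= index_date:
            by_day.setdefault(day, []).append(value)
    if not by_day:
        return None  # no measurement in the window: treated as missing
    latest_day = max(by_day)
    return median(by_day[latest_day])


if __name__ == "__main__":
    wbc = [
        (date(2019, 1, 1), 5.0),    # outside the 30-day window
        (date(2019, 3, 1), 7.0),
        (date(2019, 3, 10), 3.0),   # three entries on the latest day
        (date(2019, 3, 10), 5.0),
        (date(2019, 3, 10), 10.0),
    ]
    print(latest_value_in_lookback(wbc, date(2019, 3, 15)))
```

For the emergency department visit count, the same windowing idea would apply with a 6-month lookback and a count instead of a median.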
[0154] Data splitting. The prevalences of the three post-treatment complication endpoints in the data set are: 12% for in-hospital mortality, 23% for 6-month mortality, and 28% for 30-day readmission. Before training each of the distinct ML models for each of the defined endpoints, the data was split into a training set and a tuning set with an 80 / 20 ratio every time (Fig. 7). To maintain the prevalence of the positive and negative labels for each of the endpoints in both the training and the tuning sets, the patients were stratified during the split (Tables 2-4). Each of the distinct ML models was trained using a training set corresponding to 80% of the data and evaluated with a tuning set corresponding to 20% of the data. The training set further underwent k-fold nested cross-validation (CV). A k-fold nested cross-validation consists of two levels of cross-validation: an outer loop and an inner loop. The outer loop is used for model evaluation using AUC: the training dataset is divided into k equally sized folds, and the model is trained iteratively on k-1 of the folds and tested on the remaining fold of the dataset. The inner loop is used for hyperparameter tuning: with each iteration of the outer loop, a nested (inner) cross-validation is performed on the subset of k-1 folds of the training data, wherein for each combination of hyperparameters the data is split into k folds and the model is trained on k-1 folds and tested on the remaining fold. The best-performing hyperparameters are then used to train the model on the entire training set of the outer loop.
[0155] The “k” represents the number of equally sized subsets / folds of the dataset, specifically between 5 and 10 subsets / folds, where the model is trained on “k-1” subsets / folds and tested on the remaining subset / fold of the dataset. Specifically, k may be 5. Specifically, k may be 10.
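The stratified 80 / 20 split can be sketched in a few lines. This is an illustrative sketch only, not the inventors' implementation: the function name `stratified_split`, the fixed random seed and the toy cohort of 1000 labels are assumptions. The 12% prevalence mirrors the in-hospital mortality prevalence reported above.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split sample indices into training and tuning sets, per label class.

    Shuffling and cutting each class separately keeps the share of positive
    and negative labels (approximately) equal in both sets.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    train, tune = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        tune.extend(indices[cut:])
    return sorted(train), sorted(tune)


if __name__ == "__main__":
    # Toy cohort: 120 positives out of 1000 (12% prevalence).
    labels = [1] * 120 + [0] * 880
    train, tune = stratified_split(labels)
    print(len(train), sum(labels[i] for i in train) / len(train))
    print(len(tune), sum(labels[i] for i in tune) / len(tune))
```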
[0156] AUC (Area Under the Receiver Operating Characteristic Curve, sometimes referred to as AUC-ROC) is a scalar value representing the overall performance of the classifier and reflects the model's ability to discriminate between positive and negative classes. The AUC value lies between 0.0 and 1.0, where a value less than 0.5 indicates that the model performs worse than a random guess, and 1.0 is the ideal AUC value, representing a perfect model that correctly classifies all positive and negative instances without any errors.
Table 2. Data stratification retaining the prevalence of positive and negative labels for training a ML model for predicting in-hospital mortality after treatment of STEMI
[0157] Table 3. Data stratification retaining the prevalence of positive and negative labels for training a ML model for predicting 6-month mortality after treatment of STEMI
[0158] Table 4. Data stratification retaining the prevalence of positive and negative labels for training a ML model for predicting 30-day readmission at hospital after discharge of the patient
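As a worked illustration of the AUC metric defined above, the sketch below computes it from its probabilistic interpretation: the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The function name `roc_auc` and the example scores are illustrative assumptions; a production pipeline would more likely use an established library routine.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels: 1 for patients who experienced the complication, 0 otherwise.
    scores: model outputs, higher meaning higher predicted risk.
    Uses a simple O(n_pos * n_neg) pairwise comparison for clarity.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 3 of 4 pairs ordered correctly
```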
[0159] Handling of missing data. The comprehensive feature set comprised 70 features. To ensure the robustness and reliability of the dataset, outlier detection was performed, wherein values out of range were set as missing values. To handle missing values, the inventors removed features which had more than 60% of missing values. The remaining missing numerical variables were imputed using median imputation. In the median imputation technique, the missing value is replaced with the median of the non-missing values of the particular feature. The remaining missing categorical variables were imputed using mode imputation. In the mode imputation technique, the missing value is replaced with the mode, which is the most frequently occurring value in the dataset for that particular feature. This approach ensures that the imputed values are representative of the underlying data distribution, thereby enhancing the model's performance and reliability. The mode and median values were derived from the training split data.
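The missing-data handling can be sketched as below, assuming each patient is a dictionary in which `None` marks a missing (or out-of-range) value. The function names `fit_imputer` and `apply_imputer` and the toy feature names are hypothetical. The fill values are learned from the training split only and then applied to any other split, as the passage above describes.

```python
from statistics import median, mode


def fit_imputer(train_rows, numeric, categorical, max_missing=0.6):
    """Learn per-feature fill values from the training split.

    Features with more than `max_missing` missing values are dropped;
    numeric features get the median, categorical features the mode.
    """
    fill = {}
    n = len(train_rows)
    for name in list(numeric) + list(categorical):
        observed = [r[name] for r in train_rows if r.get(name) is not None]
        missing_fraction = 1 - len(observed) / n
        if missing_fraction > max_missing or not observed:
            continue  # feature removed from the feature set
        fill[name] = median(observed) if name in numeric else mode(observed)
    return fill


def apply_imputer(rows, fill):
    """Keep only retained features and replace missing values with fill values."""
    return [
        {name: (r.get(name) if r.get(name) is not None else value)
         for name, value in fill.items()}
        for r in rows
    ]


if __name__ == "__main__":
    train = [
        {"albumin": 3.0, "sex": "F", "hdl": None},
        {"albumin": None, "sex": "M", "hdl": None},
        {"albumin": 5.0, "sex": "M", "hdl": None},
        {"albumin": 4.0, "sex": None, "hdl": 50.0},
    ]
    fill = fit_imputer(train, numeric={"albumin", "hdl"}, categorical={"sex"})
    print(fill)  # "hdl" is 75% missing and is therefore dropped
    print(apply_imputer([{"albumin": None, "sex": None, "hdl": 40.0}], fill))
```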
[0160] After removing features which had more than 60% of missing values, the remaining combined feature set comprised 58 features, which were screened as described further below to find the final predictive feature lists for the different post-treatment complication endpoints.
[0161] The features considered in the feature selection process for the 30-day readmission and mortality endpoints were slightly different. After the removal of features with more than 60% missing values, there were 54 features considered for the 30-day readmission endpoint prediction model (Table 5). In contrast to the features considered for the mortality endpoint prediction models, the features considered for the 30-day readmission endpoint prediction model additionally comprised: emergency admission, LDL, admission duration, dementia, HDL, total cholesterol, peptic ulcer disease, triglyceride, diabetes with chronic complications. The index date being the day of hospital discharge allowed for the incorporation of these features. Features such as emergency admission and admission duration, which are also used as input features in the LACE index, could be calculated accurately. Additionally, other features listed had less than 60% missingness due to the index date being at the end of the hospitalization, or they were kept because they were required for the calculation of the Charlson Comorbidity Index (CCI).
[0162] After the removal of features with more than 60% missing values, 48 features were considered for the mortality endpoint prediction models (Table 6). In contrast to the features considered for the 30-day readmission endpoint prediction model, the features considered for the mortality endpoint prediction models additionally comprised GRACE-specific features: cardiac arrest, STEMI, elevated troponin and Killip class.
[0163] The following features were used in feature selection for both the 30-day readmission model and the mortality models: serum creatinine, atrial arrhythmia, systolic blood pressure, troponin I, valvular heart disease, myocardial infarction, cardiomyopathy diagnosis, metastatic carcinoma, dyslipidemia, rales and / or jugular venous distention, peripheral arterial disease, ventricular arrhythmia, hemoglobin, blood urea nitrogen, emergency department visits last 6 months, cerebrovascular disease, pulmonary edema, sodium, cardiogenic shock, stroke or transient ischemic attack, white blood cell count, hypertension, diabetes without chronic complications, oxygen saturation, marital status, peripheral vascular disease, chronic pulmonary disease, pacemaker or defibrillator, mild liver disease, renal disease, moderate or severe liver disease, sex, congestive heart failure, coronary artery bypass grafting or percutaneous coronary intervention, coronary artery disease, cancer, paraplegia and hemiplegia, diastolic blood pressure, human immunodeficiency virus or acquired immunodeficiency syndrome, glucose, potassium, serum albumin, connective tissue disease or rheumatic disease, platelet, age.
[0164] Table 5. Description of predictive features used in feature selection process for predicting mortality (in-hospital and / or 6-month)
[0165] 1.1.2. Feature selection process
[0166] According to the example described herein, the features were selected based on their predictive power for each of the endpoints separately - in-hospital mortality after treatment of STEMI, 6-month mortality after treatment of STEMI, and the risk of 30-day readmission at hospital after discharge of the patient. Feature selection was performed in two stages: (A) initial model training and SHAP (SHapley Additive exPlanations) value based feature ranking, (B) iterative feature selection with stop criteria to identify the most relevant features (Fig. 4). Additionally, clinical experts were consulted on the selected feature list to ensure that the features are sensible and are also in line with clinical practice of data collection. The feature selection method is outlined in more detail below.
[0167] A. Initial Model Training and SHAP value based Feature Ranking
[0168] The first stage of the feature selection process comprises the initial model training and SHAP value based feature ranking. The initial model used for feature selection was a CatBoost model, although other architectures were also tested after feature selection. The SHAP values represent how much each feature contributes to the predicted value of the target, taking into account all other features in the same instance.
[0169] The first stage of the feature selection process included the following steps: (i) training the CatBoost model using 5-fold nested cross-validation on all of the features; (ii) computation of SHAP values for all of the features using the in-built function in the CatBoost package; (iii) ranking the (X) number of features based on the average absolute SHAP values.
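Step (iii) can be sketched as below, assuming the per-sample SHAP values from step (ii) are already available as a table with one row per patient and one column per feature. The function name `rank_by_mean_abs_shap` and the toy values are hypothetical; computing the SHAP values themselves (done in the source with the CatBoost package) is outside this sketch.

```python
def rank_by_mean_abs_shap(shap_values, feature_names):
    """Rank features by their average absolute SHAP value, highest first.

    shap_values: list of rows, each holding one SHAP value per feature,
    in the same order as feature_names (assumed precomputed).
    """
    n = len(shap_values)
    mean_abs = {
        name: sum(abs(row[j]) for row in shap_values) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(feature_names, key=lambda name: mean_abs[name], reverse=True)


if __name__ == "__main__":
    names = ["age", "cardiac_arrest", "sodium"]
    shap = [
        [0.1, -0.5, 0.0],    # patient 1
        [-0.3, 0.4, 0.05],   # patient 2
    ]
    print(rank_by_mean_abs_shap(shap, names))
```

Taking the absolute value before averaging matters because a feature that pushes predictions strongly in both directions would otherwise average out to near zero.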
[0170] A 5-fold nested cross-validation consists of two-levels of cross-validation: an outer loop and inner loop. The outer loop is used for model evaluation based on the highest AUC value and the training dataset is divided in 5 equally sized folds, where the model is trained iteratively on 4 of the folds and tested on the remaining fold of the dataset. The inner loop is used for hyperparameter tuning and with each iteration of the outer loop a nested (inner) cross-validation is performed on the subset of 4 folds of the training data, wherein for each combination of hyperparameters the data is split into 5 folds and model is trained on 4 folds and tested on the remaining fold. The best-performing hyperparameters are then used to train the model on the entire training set of the outer loop.
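The two-level structure of nested cross-validation can be sketched as a generic skeleton. In this sketch, `fit` and `evaluate` are caller-supplied callables standing in for model training and AUC scoring; both names, and the simple interleaved fold assignment without shuffling or stratification, are assumptions made to keep the example short.

```python
from statistics import mean


def k_folds(indices, k):
    """Split indices into k folds of near-equal size (interleaved assignment)."""
    return [indices[i::k] for i in range(k)]


def nested_cv(indices, param_grid, fit, evaluate, k=5):
    """Return one outer-loop score per outer fold.

    fit(train_indices, params) -> model
    evaluate(model, test_indices) -> score (e.g. AUC)
    """
    outer = k_folds(indices, k)
    outer_scores = []
    for i, outer_test in enumerate(outer):
        outer_train = [idx for j, fold in enumerate(outer) if j != i for idx in fold]
        inner = k_folds(outer_train, k)

        def inner_score(params):
            # Inner loop: score one hyperparameter setting on the outer-train data only.
            scores = []
            for m, inner_test in enumerate(inner):
                inner_train = [idx for j, fold in enumerate(inner) if j != m for idx in fold]
                scores.append(evaluate(fit(inner_train, params), inner_test))
            return mean(scores)

        best_params = max(param_grid, key=inner_score)
        # Refit on the whole outer-train set with the best hyperparameters,
        # then score on the held-out outer fold.
        outer_scores.append(evaluate(fit(outer_train, best_params), outer_test))
    return outer_scores


if __name__ == "__main__":
    # Toy problem: the "model" is just a cut-off on x, tuned as a hyperparameter.
    xs = list(range(20))
    ys = [1 if x >= 10 else 0 for x in xs]
    fit = lambda train_idx, cutoff: cutoff
    evaluate = lambda cutoff, test_idx: sum((xs[i] >= cutoff) == bool(ys[i]) for i in test_idx) / len(test_idx)
    print(nested_cv(xs, [5, 10, 15], fit, evaluate, k=5))
```

The key property is that the outer test fold is never seen during hyperparameter selection, so the outer scores are not biased by tuning.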
[0171] B. Iterative Feature Selection with stop criteria
[0172] The second stage of the feature selection process comprised determining the optimal number of features by evaluating the model's performance (AUC value). The second stage of the feature selection process includes the following steps: (i) retrieving the (X) features ranked based on the average absolute SHAP values of the first stage described above; and for each of the features in the (X) feature list iteratively repeating the steps (ii)-(iv), wherein the steps are (ii) training the model on the feature set ([K]), comprising the (K) top features of the (X) features, using the training set and 5-fold nested cross-validation; (iii) evaluating and storing the mean and standard deviation of AUC; (iv) adding a feature (K+1) to the feature set ([K, K+1]), comprising the (K+1) top features of the (X) features; the steps (ii)-(iv) are repeated until the current mean AUC does not improve significantly compared to the previous two mean AUC values, or all (X) features are included in the feature set (Fig. 8); (v) retrieving the top final features for the final model. Significance is assessed using an ANOVA test with an alpha threshold set at 0.95.
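The iterative loop of steps (ii)-(iv) can be sketched as follows. Two simplifications are assumed and should be read as such: the ANOVA significance test is replaced by a fixed minimum AUC gain (`min_gain`, a hypothetical parameter), and the final feature set is taken to be the one two steps before the stop, which is an interpretation consistent with the stop points and selected feature counts reported in the following paragraphs. The callable `mean_auc_for` stands in for training and nested cross-validation on a given feature set.

```python
def forward_select(ranked_features, mean_auc_for, min_gain=0.005):
    """Grow the top-K feature set until the mean AUC plateaus.

    ranked_features: features sorted by mean absolute SHAP value, best first.
    mean_auc_for(features) -> mean cross-validated AUC for that feature set.
    Stops when the current AUC exceeds neither of the previous two AUC values
    by at least min_gain (a stand-in for the ANOVA test in the source).
    Returns (selected_features, auc_history).
    """
    history = []
    for k in range(1, len(ranked_features) + 1):
        history.append(mean_auc_for(ranked_features[:k]))
        if k >= 3 and history[-1] - min(history[-3], history[-2]) < min_gain:
            # Plateau over sizes k-2, k-1 and k: keep the smallest of the three.
            return list(ranked_features[:k - 2]), history
    return list(ranked_features), history


if __name__ == "__main__":
    # Hypothetical mean AUC per feature-set size, for illustration only.
    toy_auc = {1: 0.60, 2: 0.70, 3: 0.75, 4: 0.752, 5: 0.753, 6: 0.76}
    selected, history = forward_select(
        ["a", "b", "c", "d", "e", "f"], lambda feats: toy_auc[len(feats)]
    )
    print(selected, history)
```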
[0173] Fig. 8A illustrates the AUC curve of in-hospital mortality feature forward selection. The first stop in the feature forward selection process for in-hospital mortality risk prediction occurred at feature number 15, indicating that adding additional features beyond this point did not result in a significant change in AUC when considering the threshold. As the AUC did not significantly change beyond the 13th feature, the inventors selected features up to the 13th feature, ensuring that the model would be efficient and effective. A CatBoost model comprising the first 4 features (a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, and a feature indicative of whether the patient has suffered cardiogenic shock; and a plurality of laboratory test features comprising: a feature indicative of the patient’s white blood cell count, and a feature indicative of the patient’s serum albumin level) already achieved a higher AUC than the benchmark clinical score GRACE 2.0 (AUC=0.74, Fig. 11A), and a CatBoost model comprising the first 3 features (a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, and a feature indicative of whether the patient has suffered cardiogenic shock; and one or more laboratory test features comprising: a feature indicative of the patient’s white blood cell count) already achieved a very similar AUC to the benchmark clinical score GRACE 2.0, and includes the features with a strong impact on model prediction according to SHAP feature analysis (see Fig. 9A). The data in Fig. 8A further show that a model using features comprising a plurality of clinical history features including a feature indicative of whether the patient has suffered cardiac arrest and a feature indicative of whether the patient has suffered cardiogenic shock, a plurality of laboratory test features including a feature indicative of the patient’s white blood cell count and a feature indicative of the patient’s serum albumin level, and one or more demographic features comprising a feature indicative of the patient’s age (top 5 features) performs extremely well. Similarly, models including the first 6, the first 7, the first 8, the first 9, or the first 10 features on Fig. 8A are explicitly envisaged and expected to be beneficial.
[0174] Fig. 8B illustrates the AUC curve of 6-month mortality feature forward selection. In the case of the 6-month mortality model, the feature forward selection process stopped at the 16th feature. As a result, the inventors selected features up to the 14th feature, as further additions did not yield significant improvements in AUC. A CatBoost model comprising the first 4 features (a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, and a feature indicative of whether the patient has suffered cardiac arrest; a demographic feature comprising a feature indicative of the patient’s age; and a laboratory test feature comprising a feature indicative of the patient’s serum albumin level) already achieved a higher AUC than the benchmark clinical score GRACE 2.0 (AUC=0.71, Fig. 11B). Similarly, models including the first 5, the first 6, the first 7, the first 8, the first 9, or the first 10 features on Fig. 8B are explicitly envisaged and expected to be beneficial.
[0175] Fig. 8C illustrates the AUC curve of 30-day readmission feature forward selection. For the 30-day readmission model, the feature forward selection process halted at the 9th feature. Consequently, the first 7 features were selected, as the AUC did not exhibit significant changes beyond this point. A CatBoost model comprising the first 2 features (a hospital admission history feature comprising a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting, and a laboratory test feature comprising a feature indicative of the patient’s hemoglobin level) already achieved a higher AUC than the benchmark clinical score LACE index (AUC=0.57, Fig. 11C). Similarly, models including the first 3, the first 4, the first 5, the first 6, the first 7, the first 8, the first 9, or the first 10 features on Fig. 8C are explicitly envisaged and expected to be beneficial.
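The forward selection with a stopping threshold described for Figs. 8A to 8C can be illustrated with a minimal sketch. It is not the inventors' implementation: the synthetic data, the logistic regression estimator (swapped in for CatBoost to keep the example dependency-light), the 5-fold cross-validation and the `min_gain` value are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score


def forward_select(X, y, min_gain=0.005, max_features=None):
    """Greedy forward selection: each round adds the feature with the best CV AUC.

    Stops when the best candidate improves the AUC by less than `min_gain`
    (a hypothetical threshold chosen for this sketch).
    """
    n_features = X.shape[1]
    max_features = max_features or n_features
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    selected, auc_path = [], []
    best_auc = 0.5  # AUC of an uninformative model
    while len(selected) < max_features:
        scores = {}
        for j in range(n_features):
            if j in selected:
                continue
            model = LogisticRegression(max_iter=1000)
            cols = selected + [j]
            scores[j] = cross_val_score(
                model, X[:, cols], y, cv=cv, scoring="roc_auc"
            ).mean()
        best_j = max(scores, key=scores.get)
        if scores[best_j] - best_auc < min_gain:
            break  # no remaining feature changes the AUC enough
        selected.append(best_j)
        best_auc = scores[best_j]
        auc_path.append(best_auc)
    return selected, auc_path


# Synthetic stand-in for the patient feature matrix (assumption).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=3, n_redundant=0, random_state=0
)
selected, auc_path = forward_select(X, y)
print("selected feature indices:", selected)
print("AUC after each addition:", [round(a, 3) for a in auc_path])
```

Plotting `auc_path` against the number of selected features would give a curve of the kind shown in Fig. 8, from which a smaller feature set can be chosen at the point where the curve flattens.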
[0176] The final feature sets for each of the endpoints, determined using the exemplary method described herein, are listed below (Tables 7-9). The SHAP values of the final feature set are shown in Fig. 9.
[0177] Table 7 shows the final feature list for predicting the risk of in-hospital mortality after treatment of STEMI: a plurality of clinical history features: a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cardiogenic shock, a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered atrial arrhythmia, and a feature indicative of whether the patient has suffered cerebrovascular disease; a plurality of laboratory test features: a feature indicative of the patient’s white blood cell count, a feature indicative of the patient’s serum albumin level, a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, and a feature indicative of the patient’s platelet count; a plurality of vital signs features: a feature indicative of the patient's systolic blood pressure, and a feature indicative of the patient's oxygen saturation level; and a demographic feature comprising a feature indicative of the patient’s age.
[0178] Table 8 shows the final feature list for predicting the risk of 6-month mortality after treatment of STEMI: a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cerebrovascular disease, a feature indicative of whether the patient has suffered rales and / or jugular venous distension (JVD), a feature indicative of whether the patient has suffered atrial arrhythmia, and a feature indicative of whether the patient has suffered cardiogenic shock; a demographic feature comprising a feature indicative of the patient’s age; a plurality of laboratory test features comprising a feature indicative of the patient’s serum albumin level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, a feature indicative of the patient’s white blood cell count, a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s platelet count, and a feature indicative of the patient’s hemoglobin level; and a vital signs feature comprising a feature indicative of the patient's systolic blood pressure.
[0179] Table 9 shows the final feature list for predicting the risk of 30-day readmission at hospital after discharge of the patient: a hospital admission history feature comprising a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time (6 months) prior to the predicting; and a plurality of laboratory test features comprising a feature indicative of the patient’s hemoglobin level, a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, a feature indicative of the patient’s blood high density lipoprotein (HDL) level, a feature indicative of the patient’s blood sodium level, and a feature indicative of the patient’s serum albumin level.
[0180] Table 7. List of the final feature set (13 features) for training a ML model for predicting risk of in-hospital mortality after treatment of STEMI.
[0181] Table 8. List of the final feature set (14 features) for training a ML model for predicting risk of 6-month mortality after treatment of STEMI.
[0182] Table 9. List of the final feature set (7 features) for training a ML model for predicting risk of 30-day readmission at hospital after discharge of the patient.
[0183] 1.1.3. Training the model
[0184] A set of distinct ML models was trained to predict the risk of in-hospital mortality after treatment of STEMI, 6-month mortality after treatment of STEMI, and the risk of 30-day readmission at hospital after discharge of the patient. In the examples described herein, for each of the endpoints the inventors trained the following models: a plurality of decision tree based models, specifically a CatBoost model, an XGBoost model and a random forest (RF) model; and a regression model, specifically a logistic regression (LR) model. All model training and testing was implemented in the Python programming language using the following libraries: the Scikit-learn open-source software library was used for retrieving trainable models, and the CatBoost and eXtreme Gradient Boosting (XGBoost) open-source software libraries were used for providing gradient boosting trainable models.
[0185] As mentioned above, each of the selected trainable models was trained using the training set corresponding to 80% of the data and evaluated with a tuning set corresponding to the remaining 20% of the data, wherein the data is stratified to retain the prevalence of positive and negative labels. To evaluate and compare the different types of trainable models, the training set further underwent a 10-fold nested cross-validation (as described above), wherein the data was also stratified to retain the prevalence of positive and negative labels across the folds. For hyperparameter tuning within the inner loop, random search cross-validation (CV) was used to efficiently explore the hyperparameter space for each of the machine learning models. A random search randomly samples a defined number of combinations from the specified hyperparameter distributions. The best-performing hyperparameters are then used to train the model on the entire training set of the outer loop. Alternatively, grid search can be used for hyperparameter tuning.
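As a rough illustration of the stratified 80/20 split and the nested cross-validation with random search described above, the following sketch uses Scikit-learn only. The synthetic data, the random forest estimator, the hyperparameter ranges, the 3-fold inner loop and `n_iter=4` are assumptions chosen so that the example runs quickly; they are not the settings used in the examples.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import (
    RandomizedSearchCV,
    StratifiedKFold,
    cross_val_score,
    train_test_split,
)

# Synthetic, imbalanced stand-in for the patient data (assumption).
X, y = make_classification(
    n_samples=300, n_features=6, weights=[0.8, 0.2], random_state=0
)

# 80/20 split, stratified so both parts keep the prevalence of positive labels.
X_train, X_tune, y_train, y_tune = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Inner loop: random search over hyperparameters (3 folds is an assumption).
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
search = RandomizedSearchCV(
    RandomForestClassifier(n_estimators=20, random_state=0),
    param_distributions={"max_depth": [2, 3, 4, 5], "min_samples_leaf": [1, 5, 10]},
    n_iter=4,  # number of sampled hyperparameter combinations
    scoring="roc_auc",
    cv=inner_cv,
    random_state=0,
)

# Outer loop: 10 stratified folds; in each fold the search is refit on the
# outer training part with its best hyperparameters and scored on the held-out part.
outer_cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
outer_auc = cross_val_score(search, X_train, y_train, cv=outer_cv, scoring="roc_auc")
print("mean outer-fold AUC:", round(float(outer_auc.mean()), 3))
```

Replacing `RandomizedSearchCV` with `GridSearchCV` gives the grid search alternative mentioned in the text.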
[0186] The performance of the different models was evaluated separately for each endpoint on the training data, and compared to the benchmark standard of care clinical scores. The GRACE 2.0 score was used as a benchmark for evaluating ML models for predicting the risk of in-hospital mortality after treatment of STEMI and the risk of 6-month mortality after treatment of STEMI. The GRACE 2.0 risk score comprises two risk calculators to independently predict in-hospital (IH) mortality and six-month (6M) mortality. GRACE 2.0 (IH) and GRACE 2.0 (6M) use the same set of input features, but employ different feature weights to independently predict in-hospital mortality and six-month mortality, respectively. In the present examples, in the context of evaluating ML models for predicting the risk of in-hospital mortality after treatment of STEMI, references to the GRACE 2.0 score refer to GRACE 2.0 (IH), and in the context of evaluating ML models for predicting the risk of 6-month mortality after treatment of STEMI, references to the GRACE 2.0 score refer to GRACE 2.0 (6M). The LACE index was used as a benchmark for evaluating ML models for predicting the risk of 30-day readmission at hospital after discharge of the subject. For direct comparison with the trained models, the performance (AUC) of the benchmark clinical scores, GRACE 2.0 (IH or 6M) and LACE index, was determined using the same training dataset (TrinetX). First, the performance of the models described herein was evaluated by comparing their Area Under the Receiver Operating Characteristic Curve (AUC, sometimes referred to as AUC-ROC) on the training data. The ML model with the highest AUC for each of the endpoints was chosen for further training and evaluation. In the examples described herein, the XGBoost or CatBoost model was chosen, depending on the endpoint.
[0187] Subsequently, after selecting the best model for each endpoint, the inventors trained the model using a 10-fold nested cross-validation (k=10). The performance of the models was compared to the standard of care clinical scores: GRACE 2.0 score and LACE index, when applied to the same data (i.e. training dataset).
[0188] Each of the ML models was calibrated and threshold-adjusted to compare it with the standard of care clinical scores on the threshold-adjusted cut-offs. The calibration was done using Platt scaling. Platt scaling is a post-processing technique used to convert the raw output scores of a binary classification model into well-calibrated probability estimates. This method is particularly valuable when the outputs of the classifier need to reflect the true likelihood of an instance belonging to a particular class, making it suitable for applications requiring probabilistic decision-making and risk assessment. Platt scaling achieves this by fitting a logistic regression model to the classifier's output scores, using a sigmoid function to map these scores to the [0, 1] probability interval. The threshold adjustment was done by reproducing, with each of the trained ML models, the proportions of patients that the standard of care clinical scores GRACE 2.0 and LACE assign to the low, medium and high risk categories. This is described in more detail below.
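The idea of Platt scaling can be sketched in a few lines. This is an illustrative approximation, not the calibration code used in the examples: the raw scores are simulated, and Scikit-learn's `LogisticRegression` with a very weak penalty (`C=1e6`) stands in for the sigmoid fit; the original Platt procedure also smooths the target labels, which is omitted here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical held-out labels and uncalibrated classifier scores (assumption):
# positives tend to receive higher raw scores than negatives.
y_cal = rng.integers(0, 2, size=500)
raw_scores = rng.normal(loc=2.0 * y_cal - 1.0, scale=1.5)

# Fit a one-dimensional logistic regression p = sigmoid(a * score + b).
platt = LogisticRegression(C=1e6)
platt.fit(raw_scores.reshape(-1, 1), y_cal)


def calibrate(scores):
    """Map raw scores to calibrated probabilities in [0, 1]."""
    scores = np.asarray(scores, dtype=float).reshape(-1, 1)
    return platt.predict_proba(scores)[:, 1]


probs = calibrate(raw_scores)
print("slope a:", platt.coef_[0][0], "intercept b:", platt.intercept_[0])
```

In practice the sigmoid would be fitted on data not used to train the classifier; Scikit-learn's `CalibratedClassifierCV` with `method="sigmoid"` packages the same idea together with the cross-validation bookkeeping.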
[0189] The performance of the trained models was compared to that of the benchmark scores based on the following metrics: AUC, AUPRC, F1-score, sensitivity (sens), specificity (spec), positive predictive value (ppv), and negative predictive value (npv). The Precision-Recall Curve (PRC) shows the tradeoff between precision and sensitivity (recall) for different thresholds. The area under this curve (AUPRC) gives a single scalar value ranging between 0.0 and 1.0. A random classifier achieves an AUPRC approximately equal to the prevalence of the positive class (0.5 only when the classes are balanced), so a value below that baseline indicates that the model performs worse than a random classifier, whereas 1.0 is the ideal AUPRC value, representing a perfect model that correctly classifies all positive instances without any errors. Precision is the proportion of true positive (TP) results (correctly predicted positive cases) versus all positive predictions (true positives (TP) and false positives (FP)) made by the model. Precision = TP / (TP+FP). Sensitivity (sens), also referred to as recall or true positive rate, is a measure of a model's ability to correctly identify positive cases. Sensitivity is the proportion of true positive (TP) results (correctly predicted positive cases) versus all the actual positive cases (true positives (TP) and false negatives (FN)). Sensitivity = TP / (TP+FN). The F1-score is the harmonic mean of precision and sensitivity. F1 = (2*precision*sensitivity) / (precision+sensitivity). Specificity (spec), also referred to as true negative rate, is a measure of a model's ability to correctly identify negative cases. Specificity is the proportion of true negative (TN) results (correctly predicted negative cases) versus all the actual negative cases (true negatives (TN) and false positives (FP)). Specificity = TN / (TN+FP). Positive predictive value (ppv), which is identical to precision, is the proportion of true positive (TP) results (correctly predicted positive cases) versus all the cases that the model has predicted as positive (true positives (TP) and false positives (FP)). ppv = TP / (TP+FP). Negative predictive value (npv) is the proportion of true negative (TN) results (correctly predicted negative cases) versus all the cases that the model has predicted as negative (true negatives (TN) and false negatives (FN)). npv = TN / (TN+FN).
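The threshold-based metrics defined above can be computed directly from the four confusion-matrix counts. The short sketch below does so for a made-up confusion matrix; the counts are hypothetical and serve only to make the formulas concrete.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the threshold-based metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)      # identical to ppv
    sensitivity = tp / (tp + fn)    # recall, true positive rate
    specificity = tn / (tn + fp)    # true negative rate
    npv = tn / (tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "precision": precision,
        "ppv": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts for 1000 patients (assumption, not study data).
m = classification_metrics(tp=40, fp=10, tn=930, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these counts the precision is 40/50 = 0.8 and the sensitivity is 40/60, so the F1-score is 80/110, about 0.727.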
[0190] For further evaluation of the model performance, the inventors performed a calibration analysis using the Brier score and a feature ablation analysis by increasing the percentages of missing data. A calibration plot (Fig. 12), or reliability diagram, is a visual tool used to assess how well a classification model's predicted probabilities align with actual class frequencies. The plot features a reliability curve that illustrates the relationship between predicted probabilities and observed outcomes. The x-axis represents the predicted probabilities, while the y-axis shows the true probability in each bin, which is the observed frequency of the positive class within that bin. A diagonal line on the plot indicates perfect calibration, where the predicted probabilities match the true probabilities exactly. If the reliability curve deviates below the diagonal, it suggests the model is overconfident, predicting higher probabilities than what is actually observed. Conversely, if the curve is above the diagonal, the model is underconfident, predicting lower probabilities than observed. Alongside the calibration plot, the Brier score is a measure used to assess the accuracy of probabilistic predictions. It is defined as the mean squared difference between the predicted probability assigned to the positive outcome and the actual outcome. The value of the Brier score can range from 0 to 1, wherein 0 indicates perfect accuracy of a model, meaning that the predicted probabilities perfectly match the actual outcomes, and 1 indicates a poorly performing model in which all predictions are incorrect. Additionally, a box-and-whisker plot (Fig. 12, panels below the calibration plot) is used to assess the concentration and dispersion of the model's predictions and to indicate the presence of any outliers. A narrow box suggests that the predicted probabilities are tightly clustered, indicating consistent predictions, while a wider box implies more variability. The position of the median line within the box can indicate skewness; a median closer to the bottom suggests a tendency toward lower probability predictions, while a median near the top indicates that higher probability predictions are more frequent. The length of the whiskers reveals the spread of the data beyond the interquartile range, and the presence of outliers highlights predictions that deviate significantly from the majority. The calibration curves and Brier scores of the trained models and the standard of care clinical scores GRACE 2.0 or LACE were compared.
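To make the Brier score and the reliability curve concrete, the following sketch computes both for four toy predictions. The labels and probabilities are invented for illustration; `brier_score_loss` and `calibration_curve` are existing Scikit-learn functions, and the choice of two bins is an assumption suited to such a small example.

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss

# Toy outcomes and predicted probabilities (assumption, not study data).
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.1, 0.4, 0.8, 0.7])

# Brier score: mean squared difference between prediction and outcome.
brier_manual = float(np.mean((y_prob - y_true) ** 2))
brier_sklearn = brier_score_loss(y_true, y_prob)

# Reliability curve: observed positive frequency vs mean predicted
# probability within each of two equal-width probability bins.
prob_true, prob_pred = calibration_curve(y_true, y_prob, n_bins=2)

print("Brier score:", brier_manual)
print("observed frequency per bin:", prob_true)
print("mean predicted probability per bin:", prob_pred)
```

Here the squared errors are 0.01, 0.16, 0.04 and 0.09, giving a Brier score of 0.075. Plotting `prob_pred` on the x-axis against `prob_true` on the y-axis yields the reliability curve that is compared with the diagonal.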
[0191] The feature ablation analysis is used to evaluate the model's ability to maintain consistent predictive performance despite variations and / or changes in the input data, specifically evaluating the model's performance in scenarios where certain features or feature subgroups are missing or unobtainable. Features and feature subgroups analyzed for the in-hospital mortality endpoint: age, comorbidities (cerebrovascular disease, cardiogenic shock, atrial arrhythmia, congestive heart failure, cardiac arrest), vital signs (systolic BP, O2 saturation), serum albumin levels, complete blood count (platelet, white blood cell (WBC)), enzymatic assay (serum creatinine level, Blood Urea Nitrogen (BUN) level). Features and feature subgroups analyzed for the 6-month mortality endpoint: age, comorbidities (rales and / or jugular venous distension (JVD), cerebrovascular disease, cardiogenic shock, atrial arrhythmia, congestive heart failure, cardiac arrest), systolic BP, complete blood count (platelet, white blood cell (WBC), hemoglobin level), enzymatic assay (serum creatinine level, Blood Urea Nitrogen (BUN) level), serum albumin levels. Features and feature subgroups analyzed for the 30-day readmission endpoint: emergency department (ED) visits in the last 6 months, serum albumin levels, hemoglobin levels, sodium levels, enzymatic assay (serum creatinine level, Blood Urea Nitrogen (BUN) level, high-density lipoprotein (HDL) level). The feature ablation analysis was done by incrementally increasing the missingness of a feature or feature subgroup from 0% to 100% by imputing its value, while keeping other features unchanged. Specifically, random selections of patient data for a feature (corresponding to the indicated % of the data for the feature) were replaced by the respective imputed value (computed during the model training and stored in the model) for each feature, in effect incrementally turning the feature of interest into a constant.
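The ablation procedure described above, in which a growing random share of a feature's values is replaced by a stored imputation value, might be sketched as follows. The synthetic data, the logistic regression model, the use of the column median as the imputation value and the evaluation on the same data used for fitting are all assumptions made to keep the example short.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score


def ablate(X, column, fraction, fill_value, rng):
    """Return a copy of X where `fraction` of the rows in `column` are imputed."""
    X_abl = X.copy()
    n_replace = int(round(fraction * X.shape[0]))
    rows = rng.choice(X.shape[0], size=n_replace, replace=False)
    X_abl[rows, column] = fill_value
    return X_abl


# Synthetic stand-in for the patient data and a simple model (assumptions).
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Hypothetical imputation values: per-feature medians computed at training time.
fill_values = np.median(X, axis=0)

rng = np.random.default_rng(0)
auc_by_fraction = {}
for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    X_abl = ablate(X, column=0, fraction=fraction, fill_value=fill_values[0], rng=rng)
    auc_by_fraction[fraction] = roc_auc_score(y, model.predict_proba(X_abl)[:, 1])

for fraction, auc in auc_by_fraction.items():
    print(f"missingness {fraction:.0%}: AUC = {auc:.3f}")
```

Ablating a feature subgroup rather than a single feature would apply the same replacement to several columns for the same randomly chosen rows.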
[0192] 1.1.4. Determining the thresholds
[0193] A typical output of a tree-based ML model or logistic regression model is a probabilistic output ranging from 0 to 1, which indicates the probability that a patient will experience a complication (i.e. the probability that a patient belongs to a positive class that experiences the complication rather than to a negative class that does not experience the complication). Thresholds can be selected which define the cutoff values at which the predicted probabilistic output is classified into different categories. The threshold values can vary between 0 and 1. The ML classifiers in these examples were used to classify patients into three groups of low, medium and high probability of risk of post-treatment complications. To classify the subjects into the three categories, two cutoff values were determined separately for each of the endpoints: in-hospital mortality after treatment of STEMI, 6-month mortality after treatment of STEMI, and 30-day readmission at hospital after discharge of the patient. A first cutoff value (also referred to as “rule-out”) separates patients at low risk of the complication from other patients. A second cutoff value (also referred to as “rule-in”) separates patients at high risk of the complication from other patients. Patients with a predicted probabilistic output value that is above the first cutoff value (“rule-out”) and below the second cutoff value (“rule-in”) are classified as being at medium risk of the respective post-treatment complication.
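The three-way categorization with a rule-out and a rule-in cutoff can be written as a small function. The cutoff values used in the example call (0.2 and 0.6) are placeholders chosen for illustration and are not the thresholds determined in the examples; assigning probabilities exactly equal to a cutoff to the medium category is likewise an assumption of this sketch.

```python
def risk_category(probability, rule_out, rule_in):
    """Classify a predicted probability as low, medium or high risk."""
    if not 0.0 <= rule_out < rule_in <= 1.0:
        raise ValueError("expected 0 <= rule_out < rule_in <= 1")
    if probability < rule_out:
        return "low"      # below the rule-out cutoff
    if probability > rule_in:
        return "high"     # above the rule-in cutoff
    return "medium"       # between the two cutoffs (inclusive)


# Placeholder cutoffs (assumption) applied to a few predicted probabilities.
for p in (0.05, 0.2, 0.45, 0.6, 0.9):
    print(p, risk_category(p, rule_out=0.2, rule_in=0.6))
```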
[0194] In the example described herein, the thresholds were derived from the applied benchmark’s resulting proportions and validated by the applied benchmark’s metrics. This was done according to the following steps: (i) identify the proportions of subject distribution for low, medium and high categories according to the applied benchmark’s results (Fig. 9A); (ii) identify the metrics at the low, medium and high categories from the applied benchmark’s results (Fig. 9B); (iii) determine the thresholds that replicate the applied benchmark’s proportions with the model trained on the training data set; (iv) benchmark the metrics by comparing the threshold-based metrics on the training data to the applied benchmark’s resulting metrics.
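One plausible way to carry out step (iii), finding cutoffs that reproduce the benchmark's category proportions, is to take quantiles of the model's predicted probabilities. The sketch below assumes this quantile approach, which the text does not spell out, and applies it to simulated probabilities using the low and medium shares reported in this section for GRACE 2.0 (IH) (15.63% and 32.02%). The resulting cutoffs therefore differ from the thresholds reported in the examples.

```python
import numpy as np


def thresholds_from_proportions(probabilities, low_share, medium_share):
    """Return (rule_out, rule_in) cutoffs that reproduce the given shares.

    `low_share` of predictions fall below rule_out and a further
    `medium_share` fall between rule_out and rule_in.
    """
    rule_out = np.quantile(probabilities, low_share)
    rule_in = np.quantile(probabilities, low_share + medium_share)
    return float(rule_out), float(rule_in)


# Simulated predicted probabilities standing in for model output (assumption).
rng = np.random.default_rng(0)
probs = rng.beta(1, 12, size=10_000)

rule_out, rule_in = thresholds_from_proportions(probs, 0.1563, 0.3202)

# Check that the cutoffs reproduce the requested category shares.
low = float(np.mean(probs < rule_out))
high = float(np.mean(probs > rule_in))
medium = 1.0 - low - high
print(f"rule-out = {rule_out:.3f}, rule-in = {rule_in:.3f}")
print(f"low = {low:.2%}, medium = {medium:.2%}, high = {high:.2%}")
```

Step (iv) would then compute sensitivity, specificity and the other metrics at these two cutoffs and compare them with the benchmark's metrics at its own category boundaries.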
[0195] The applied benchmark, GRACE for in-hospital mortality (GRACE 2.0 (IH)), was used as an applied result reference for the risk of in-hospital mortality after treatment of STEMI. The proportions corresponding to low, medium and high risk categories after applying the benchmark GRACE were: 15.63%, 32.02% and 52.36%. The threshold values, which replicate the applied proportions with the model trained on the training data set as described above, when the prevalence is matched, were: 0.023 for rule-out and 0.052 for rule-in with prevalence of 12.13. Comparisons of the performance of the trained model to the benchmark GRACE model are shown in Tables 10 and 11. The resulting thresholds for in-hospital mortality after treatment of STEMI for categorization in low, medium and high groups are as follows: low risk < 0.023, medium risk 0.023-0.052, and high risk > 0.052.
[0196] Table 10. Validating the threshold that replicates the proportions of the applied benchmark (In-hospital mortality rule-out). Comparison to applied benchmark metrics
[0197] Table 11. Validating the threshold that replicates the proportions of the applied benchmark (In-hospital mortality rule-in). Comparison to applied benchmark metrics
[0198] The applied benchmark, GRACE for 6-month mortality (GRACE 2.0 (6M)), was used as an applied result reference for the risk of 6-month mortality after treatment of STEMI. The proportions corresponding to low, medium and high risk categories were: 16.55%, 28.28% and 55.17%. The threshold values, which replicate the applied proportions with the model trained on the training data set as described above, when the prevalence is matched, were: 0.049 for rule-out and 0.132 for rule-in with prevalence of 22.84. Comparisons of the performance of the trained model to the benchmark GRACE model are shown in Tables 12 and 13. The resulting thresholds for 6-month mortality after treatment of STEMI for categorization in low, medium and high groups are as follows: low risk < 0.049, medium risk 0.049-0.132, and high risk > 0.132.
[0199] Table 12. Validating the threshold that replicates the proportions of the applied benchmark (6-month mortality rule-out). Comparison to applied benchmark metrics
[0200] Table 13. Validating the threshold that replicates the proportions of the applied benchmark (6-month mortality rule-in). Comparison to applied benchmark metrics
[0201] The applied benchmark, LACE, was used as an applied result reference for the risk of 30-day readmission at hospital after discharge of the patient. The proportions corresponding to low, medium and high risk categories were: 10.94%, 60.09% and 28.97%. The threshold values, which replicate the applied proportions with the model trained on the training data set as described above, when the prevalence is matched, were: 0.18 for rule-out and 0.31 for rule-in with prevalence of 28.10. Comparisons of the performance of the trained model to the benchmark LACE model are shown in Tables 14 and 15. The resulting thresholds for 30-day readmission at hospital after discharge of the STEMI patient for categorization in low, medium and high groups are as follows: low risk < 0.18, medium risk 0.18-0.31, and high risk > 0.31.
[0202] Table 14. Validating the threshold that replicates the proportions of the applied benchmark (30-day readmission rule out). Comparison to applied benchmark metrics
[0203] Table 15. Validating the threshold that replicates the proportions of the applied benchmark (30-day readmission rule in). Comparison to applied benchmark metrics
[0204] 1.2. Results
[0205] A set of ML models was trained to predict the risk of in-hospital mortality after treatment of STEMI, 6-month mortality after treatment of STEMI, and the risk of 30-day readmission at hospital after discharge of the patient. The following section summarizes the evaluation results of the ML models trained for each endpoint. The results are presented in the following sequence: (I) comparison of the calibrated and threshold-adjusted model performance, specifically AUC, of the different types of ML models with the benchmark standard of care clinical score, specifically GRACE 2.0 or LACE index (Fig. 11); (II) detailed discussion of the best performing model, wherein (i) the optimized hyperparameters and the final feature list SHAP values (Fig. 9) are shown, (ii) a comparison between the best performing ML model and the benchmark standard of care clinical score, specifically GRACE 2.0 or LACE index (Fig. 11), is provided, (iii) a model evaluation based on the Brier score (Fig. 12) is shown, and (iv) a feature ablation analysis (impact of missingness of features) (Fig. 13) is shown.
[0206] 1.2.1. Evaluation of trained ML models for predicting risk of in-hospital mortality after treatment of STEMI
[0207] The optimized hyperparameters of the trained ML models (XGBoost model, CatBoost model, RF model and LR model) to predict the risk of in-hospital mortality after treatment of STEMI are shown in Table 16. The comparison of the model performance (in terms of AUC value) of the different types of trained ML models with the benchmark standard of care clinical score GRACE 2.0 is shown in Fig. 11A. The gradient boosting algorithms (XGBoost (AUC=0.83) and CatBoost (AUC=0.82)) and the Random Forest model (AUC=0.82) performed similarly well and achieved AUC values noticeably higher than the benchmark score GRACE 2.0 (AUC=0.74). The logistic regression classifier performed similarly well (AUC=0.80). Finally, based on the highest AUC value, the XGBoost model was chosen as the best performing model for further evaluation.
Table 16. Optimized hyperparameters of trained ML models to predict risk of in-hospital mortality after treatment of STEMI.
[0208] The SHAP values of the final feature list of the trained XGBoost model to predict the risk of in-hospital mortality after treatment of STEMI are shown in Fig. 9A. The SHAP values (mean absolute across all instances) indicate that the most influential features for predicting in-hospital outcomes are cardiogenic shock (0.3456) and cardiac arrest (0.2528). These severe conditions significantly impact the model's predictions. Other critical predictors include the last recorded white blood cell count (WBC) (0.2129) and serum albumin levels (0.2113). The patient's age (0.1888) and the presence of congestive heart failure (0.1743) are also substantial factors. Additionally, oxygen saturation (median of the last measurements) (0.1738) and serum creatinine levels (0.1704) contribute notably to the model's decision-making process. Blood urea nitrogen (BUN, median of the last measurements) (0.1483) and atrial arrhythmia (0.1316) are significant as well. Systolic blood pressure (median of the last measurements) (0.1245), the presence of cerebrovascular disease (0.1223), and platelet count (median of the last measurements) (0.1144) also play important roles. These features collectively contribute to the model's predictions, with conditions such as congestive heart failure, age, serum albumin levels, specific clinical signs, and blood markers playing substantial roles in the decision-making process.
[0209] The comparison between the calibrated and threshold-adjusted XGBoost model, as the best performing ML model to predict the risk of in-hospital mortality after treatment of STEMI (indicated as Disclosure in Table 17), and the benchmark standard of care clinical score GRACE 2.0 is provided in Table 17 and Fig. 11A. The in-hospital mortality model clearly outperforms GRACE, as highlighted in bold, both for rule-out and rule-in.
[0210] Table 17. The performance metrics for rule-in and rule-out of XGBoost model for predicting the risk of in-hospital mortality after treatment of STEMI and GRACE 2.0 score. sens=sensitivity; spec=specificity.
[0211] The Brier score comparison between the XGBoost model trained to predict the risk of in-hospital mortality after treatment of STEMI and the benchmark standard of care clinical score GRACE 2.0 is shown in Fig. 12A. In our comparison, the XGBoost classifier achieved a Brier score of 0.09, whereas the benchmark model GRACE 2.0 had a Brier score of 0.10. This indicates that the XGBoost classifier is better calibrated than the GRACE model, as it has a lower Brier score. The difference in Brier scores suggests that the XGBoost classifier's probabilistic predictions are more accurate and reliable, making it a superior choice for the risk prediction of in-hospital mortality.
[0212] To investigate the impact of feature missingness on the model's performance, a feature ablation analysis was conducted, specifically examining scenarios where certain features are missing or unobtainable. The feature ablation analysis for rule-in (Fig. 13A) and rule-out (Fig. 13B) of the XGBoost model trained to predict the risk of in-hospital mortality after treatment of STEMI is shown in Fig. 13. For each feature, its missingness was incrementally increased from 0% to 100% (as indicated on the x-axis in Fig. 13) by imputing its value, while keeping other features unchanged. Upon feature ablation analysis, the inventors observed that comorbidities play a crucial role in improving performance of the model. Notably, removing comorbidities resulted in a considerable drop in both the AUC and the Area Under the Precision-Recall Curve (AUPRC), underscoring their importance. Thus, models that include a plurality of clinical history features including a feature indicative of whether the patient has suffered cardiac arrest, and a feature indicative of whether the patient has suffered cardiogenic shock (both of which were shown to be particularly important according to the results in Fig. 8A and Fig. 9A), and at least one further feature selected from: a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered atrial arrhythmia, and a feature indicative of whether the patient has suffered cerebrovascular disease (all of which are comorbidity features that were selected in the final model) are expected to perform well. Additionally, features such as age (which was shown to contribute to improved performance on Fig. 8A), vital signs (systolic blood pressure and O2 saturation, both of which were shown to be relevant vital signs features on Fig. 8A) as well as blood count (platelet and white blood cell count, both of which were shown to be relevant laboratory test features on Fig. 8A) caused a slight decrease in model performance when their missingness increased. In contrast, enzymatic assays (creatinine) and albumin had a lesser impact on model performance under similar conditions, and therefore models without these features may still perform competitively.
[0213] These observations were consistent in both rule-in and rule-out cases, highlighting the varying degrees of importance among different feature categories.
[0214] 1.2.2. Evaluation of trained ML models for predicting risk of 6-month mortality after treatment of STEMI
[0215] The optimized hyperparameters of the trained ML models (XGBoost model, CatBoost model, RF model and LR model) to predict the risk of 6-month mortality after treatment of STEMI are shown in Table 18. The comparison of the model performance (AUC value) of the different types of trained ML models (CatBoost, XGBoost, RF and LR) to predict the risk of 6-month mortality after treatment of STEMI with the benchmark standard of care clinical score GRACE 2.0 is shown in Fig. 11B. The performance results for the prediction of the risk of 6-month mortality are in line with the results obtained for the in-hospital mortality models. The machine learning based predictive models perform noticeably better than GRACE 2.0. The gradient boosting (XGBoost or CatBoost), RF and LR classifiers performed similarly well (AUC = 0.79-0.80). Finally, based on the highest AUC value, the XGBoost model was chosen as the best performing model for further evaluation.
[0216] Table 18. Optimized hyperparameters of trained ML models to predict risk of 6-month mortality after treatment of STEMI.
[0217] The SHAP values of the final feature list of the trained XGBoost model to predict the risk of 6-month mortality after treatment of STEMI are shown in Fig. 9B. For the 6-month mortality prediction model, the SHAP values (mean absolute across all instances) for the most influential features are congestive heart failure (0.2749), age (0.1972), and serum albumin (median of the last measurements) (0.1926). The presence of rales and / or jugular venous distension (JVD) (0.1755), atrial arrhythmia (0.1754), and hemoglobin (median of the last measurements) (0.1753) are also highly significant. Blood urea nitrogen (BUN, median of the last measurements) (0.1743), cardiac arrest (0.1678), and cardiogenic shock (0.1527) are critical factors influencing the model's predictions. Additionally, the white blood cell count (median of the last measurements) (0.1359), serum creatinine (median of the last measurements) (0.1145), and presence of cerebrovascular disease (0.1131) have notable impacts. Systolic blood pressure (median of the last measurements) (0.1104) and platelet count (median of the last measurements) (0.0774) are also considered by the model. These features collectively contribute to the model's predictions of 6-month mortality after treatment of STEMI, with conditions such as congestive heart failure, age, serum albumin levels, specific clinical signs, and blood markers playing substantial roles in the decision-making process.
[0218] The comparison between the calibrated and threshold-adjusted XGBoost model to predict the risk of 6-month mortality after treatment of STEMI and the benchmark standard of care clinical score GRACE 2.0 is provided in Table 19 (the XGBoost model labeled as Disclosure) and Fig. 11B. The 6-month mortality risk model (XGBoost classifier) clearly outperforms GRACE 2.0 on all metrics, as highlighted in bold, both for rule-out and rule-in.
[0219] Table 19. The performance metrics for rule-in and rule-out of the XGBoost model for predicting the risk of 6-month mortality after treatment of STEMI and of the GRACE 2.0 score. sens=sensitivity; spec=specificity
[0220] The Brier score comparison between the XGBoost model trained to predict the risk of 6-month mortality after treatment of STEMI and the benchmark standard of care clinical score GRACE 2.0 is shown in Fig. 12B. For the 6-month mortality risk prediction the XGBoost classifier achieved a Brier score of 0.13, while the benchmark model GRACE 2.0 had a Brier score of 0.16. Although the XGBoost classifier has a slightly lower Brier score, the difference is minimal. This suggests that both models are similarly calibrated given the data, with only a marginal advantage for the XGBoost classifier. Therefore, both models provide comparable levels of accuracy and reliability in their probabilistic predictions. To investigate the impact of feature missingness on the performance of the models, feature ablation analyses were conducted, specifically examining scenarios where certain features are missing or unobtainable. The results of the feature ablation analysis for rule-in and rule-out of the XGBoost model trained to predict the risk of 6-month mortality after treatment of STEMI are shown in Fig. 13C (rule-in) and Fig. 13D (rule-out).
[0221] The feature ablation analysis for the 6-month mortality risk model yielded results similar to those observed in the in-hospital mortality robustness check. Comorbidities were found to play a crucial role in the model's performance, while other feature categories were considerably less impactful. Specifically, removing comorbidities resulted in a significant decline in both the AUC and the Area Under the Precision-Recall Curve (AUPRC), underscoring their importance. Therefore, a model including a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, and a feature indicative of whether the patient has suffered cardiac arrest (both of which were found to be important in Fig. 8B and Fig. 9B), and one or more further clinical history features selected from: a feature indicative of whether the patient has suffered cerebrovascular disease, a feature indicative of whether the patient has suffered rales and / or jugular venous distension (JVD), a feature indicative of whether the patient has suffered atrial arrhythmia, and a feature indicative of whether the patient has suffered cardiogenic shock, is expected to perform particularly well. Additionally, removing blood counts (platelet count, hemoglobin, white blood cell count) was found to cause a slight decrease in performance. Therefore, a model comprising one or more laboratory test features selected from: a feature indicative of the patient’s white blood cell count, a feature indicative of the patient’s platelet count, and a feature indicative of the patient’s hemoglobin level, e.g. in addition to a feature indicative of whether the patient has suffered congestive heart failure and a feature indicative of whether the patient has suffered cardiac arrest, is expected to perform particularly well. All other features, such as age, vital signs (systolic blood pressure), albumin and creatinine, had less impact on model performance when their missingness increased.
[0222] These observations were consistent in both rule-in and rule-out cases, highlighting the varying degrees of importance among different feature categories for the 6-month mortality risk model.
[0223] 1.2.3. Evaluation of trained ML models for predicting risk of 30-day readmission at hospital after discharge of the patient.
[0224] The optimized hyperparameters of the trained ML models (XGBoost model, CatBoost model, RF model and LR model) to predict the risk of 30-day readmission at hospital after discharge of the patient are shown in Table 20. The AUC values of the different types of calibrated and threshold-adjusted trained ML models (CatBoost, XGBoost, RF and LR) for predicting the risk of 30-day readmission at hospital after discharge of the patient are compared with the benchmark standard of care clinical score LACE index in Fig. 11C. The GB (CatBoost, XGBoost), RF and LR models performed similarly well (AUC=0.63-0.64) and noticeably better than the benchmark LACE index (AUC=0.58). The CatBoost model showed a slightly higher AUC (AUC=0.64) than the other ML models and was therefore chosen for further evaluation.
[0225] Table 20. Optimized hyperparameters of trained ML models to predict the risk of 30-day readmission at hospital after discharge of the patient
[0226] The SHAP values of the final feature list of the trained CatBoost model to predict the risk of 30-day readmission at hospital after discharge of the patient are shown in Fig. 9C. The SHAP values (mean absolute across all instances) indicate that the most influential features for predicting 30-day readmission are the number of emergency department (ED) visits in the last 6 months (0.1295) and hemoglobin (median of the last measurements) (0.0500). Blood urea nitrogen (BUN, median of the last measurements) (0.0241) and serum creatinine (median of the last measurements) (0.0171) are also significant predictors. Additionally, high-density lipoprotein (HDL, median of the last measurements) (0.0165), serum albumin (median of the last measurements) (0.0165), and sodium (median of the last measurements) (0.0065) have notable impacts on the model's predictions. These features collectively contribute to the model's predictions of 30-day readmission, with the number of ED visits being the most influential, followed by various blood markers including hemoglobin, BUN, serum creatinine, HDL, serum albumin, and sodium levels.
[0227] The comparison between the calibrated and threshold-adjusted CatBoost model to predict the risk of 30-day readmission at hospital after discharge of the patient and the benchmark standard of care clinical score LACE index is provided in Table 21 and Fig. 11C. As with the mortality risk models, the 30-day readmission risk model (CatBoost classifier) clearly outperforms its benchmark, here the LACE index, on all metrics, as highlighted in bold, both for rule-out and rule-in.
Table 21. The performance metrics for rule-in and rule-out of the CatBoost model for predicting the risk of 30-day readmission at hospital after discharge of the patient and of the LACE index. sens=sensitivity, spec=specificity, ppv=positive predictive value, npv=negative predictive value.
[0228] The Brier score comparison between the CatBoost model trained to predict the risk of 30-day readmission at hospital after discharge of the patient and the benchmark standard of care clinical score LACE index is shown in Fig. 12C. For the 30-day readmission endpoint prediction the CatBoost classifier achieved a Brier score of 0.19, while the benchmark model LACE index had a Brier score of 0.23. This suggests that both models are similarly calibrated given the training data, with the CatBoost classifier having a slight edge. Therefore, both models provide relatively comparable levels of accuracy and reliability in their probabilistic predictions.
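For reference, the Brier score compared above is the mean squared difference between the predicted probability and the observed binary outcome, with lower values indicating better probabilistic predictions and 0 corresponding to a perfect prediction. The following minimal Python sketch illustrates the calculation; the function name `brier_score` and the toy inputs are illustrative assumptions and are not taken from the study.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy example (hypothetical values, not study data).
outcomes = [1, 0, 0, 1]
predicted = [0.9, 0.1, 0.2, 0.6]
score = brier_score(outcomes, predicted)
print(f"Brier score: {score:.3f}")
```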
[0229] To investigate the impact of feature missingness on the performance of the models, feature ablation analyses were conducted, specifically examining scenarios where certain features are missing or unobtainable. The results of the feature ablation analysis for rule-in and rule-out of the CatBoost model trained to predict the risk of 30-day readmission at hospital after discharge of the patient are shown in Fig. 13E (rule-in) and Fig. 13F (rule-out). For each feature, its missingness was incrementally increased from 0% to 100% (as indicated on the x-axis in Fig. 13) by imputing its value, while keeping the other features unchanged. The feature ablation analysis of the 30-day readmission model identified the number of emergency visits in the past 6 months (ED_visits_6_months, top feature) as the most significant feature. Removing this feature resulted in a significant decline in both the AUC and the AUPRC, confirming its critical importance. This finding aligns with the SHAP value plot, which also highlighted this feature. Notably, this variable is a required input parameter for the benchmark model LACE index, underscoring its overall relevance in predicting readmissions. Additionally, the analysis revealed that laboratory tests play an important role in the model's performance. Therefore, a model using features comprising one or more hospital admission history features comprising a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting, and a plurality of laboratory test features including a feature indicative of the patient’s hemoglobin level (which was shown to contribute to predictions in Fig. 8C), is expected to perform well. In contrast, other categories of features were found to be considerably less impactful when removed.
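The ablation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the data are synthetic, a scikit-learn logistic regression stands in for the CatBoost model, the imputed value is assumed to be the training median, and the helper `ablation_curve` is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score


def ablation_curve(model, X, y, feature_idx, fill_value, fractions, seed=0):
    """AUC as a growing fraction of one feature is replaced by an imputed value."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))  # fixed order, so larger masks contain smaller ones
    aucs = []
    for frac in fractions:
        X_mod = X.copy()
        n_missing = int(round(frac * len(X)))
        X_mod[order[:n_missing], feature_idx] = fill_value  # other features unchanged
        aucs.append(roc_auc_score(y, model.predict_proba(X_mod)[:, 1]))
    return aucs


# Synthetic data (assumption): feature 0 drives the outcome, features 1-2 are noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 3))
y = (X[:, 0] + 0.5 * rng.normal(size=2000) > 1.0).astype(int)
X_train, y_train, X_eval, y_eval = X[:1500], y[:1500], X[1500:], y[1500:]

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
fractions = [0.0, 0.25, 0.5, 0.75, 1.0]

# Median imputation is an assumption made for this sketch.
aucs_informative = ablation_curve(
    model, X_eval, y_eval, 0, float(np.median(X_train[:, 0])), fractions
)
aucs_noise = ablation_curve(
    model, X_eval, y_eval, 1, float(np.median(X_train[:, 1])), fractions
)
```

In this toy setting the AUC falls towards chance level as the informative feature is ablated, while ablating a noise feature leaves it essentially unchanged, which is the pattern the ablation plots are designed to reveal.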
[0230] 1.3. Conclusions
[0231] The described study presents improved methods for predicting the risk of post-treatment complications in STEMI patients. Specifically, the distinct set of ML models for predicting the risk of in-hospital mortality after treatment of STEMI, 6-month mortality after treatment of STEMI and 30-day readmission at hospital after discharge of the patient show improved evaluation metrics, such as AUC and Brier score, compared to the benchmark standard of care clinical scores GRACE 2.0 and LACE index.
[0232] The presented ML models incorporate predictive features which are currently not used in the standard of care clinical scores GRACE 2.0 and LACE index. Specifically, the presented ML models use alternative sets of features including comorbidities, lab values and vital signs which are not used in prior art scores and were not previously identified as being important to make such predictions. The improved performance of the proposed ML models in predicting post-treatment complication risks is attributed to the addition of these new features. Specifically for the model for predicting 30-day readmission at hospital after discharge, features indicative of laboratory tests, which are not currently used in the standard of care clinical score LACE index, were added and result in a highly improved prediction performance.
[0233] Example 2
[0234] The present example describes work that builds upon and expands on the work in Example 1 by training models for predicting the same post-treatment complications using a different, more general cohort of patients.
[0235] 2.1. Materials and methods
[0236] 2.1.1. Data and features
[0237] Training data. For the development of the machine learning algorithms in the present example, the inventors utilized datasets from Premier Healthcare Solutions, specifically the PREMIER PHD AC database (premierinc.com), which includes de-identified information from networks, healthcare organizations (HCOs), and other data providers across North America. The dataset specifically comprised data for patients collected between 2017 and 2023. Patients were included in the training dataset if they met all of the following criteria: (i) diagnosis of ST-segment elevation myocardial infarction (STEMI) defined by any one or more of the following ICD-10 codes (I21.01, I21.02, I21.09, I21.11, I21.19, I21.21, I21.29, I21.3, I22.0, I22.1, I22.8, I22.9); (ii) the STEMI diagnosis of the patient was recorded during an inpatient or emergency hospitalization; (iii) the admission date for the STEMI hospitalization was before January 30, 2020; and (iv) the patient had a minimum period of follow-up data available, which varied by the analysis outcome (In-hospital Mortality: no follow-up period required; 30-day Readmission / Death: at least 30 days of follow-up; 6-month Mortality: at least 180 days of follow-up). To identify features predictive of post-treatment complications in STEMI patients, the inventors extracted data from the database including demographics, clinical history and comorbidities, procedures, vital signs and laboratory tests, and patient hospital admission history / status. For the in-hospital mortality model, N = 29991 subjects (80.0% Train (29991) / 20.0% Tune (7498)) were used for ML model training purposes. For the 6-month mortality model, N = 11519 subjects (80.0% Train (11519) / 20.0% Tune (2879)) were used for ML model training purposes. For the 30-day readmission model, N = 13653 subjects (80.0% Train (13653) / 20.0% Tune (3414)) were used for ML model training purposes.
The number of patients included in the training cohort for each model varied based on the specific follow-up time required by its endpoint. Endpoints requiring longer follow-up (e.g., 6-month mortality) necessitated the exclusion of patients for whom a complete follow-up status was unknown, in order to ensure that "no event" was a true negative outcome and not a case of missing data. The prevalence of the three different post-treatment complication endpoints in the data set is: 8% for in-hospital mortality, 24% for 6-month mortality, and 19% for 30-day readmission. The laboratory test features were represented using LOINC in the PREMIER PHD AC dataset, therefore LOINC codes were used to extract these values. The composition of the cohort of the training data obtained from the Premier Healthcare Solutions dataset, specifically the balanced patient gender distribution, the wide patient age range, the varied regional locations, and the patient ethnic distribution, is shown in Fig. 14. Fig. 14A shows the cohort composition for the in-hospital mortality outcome, Fig. 14B for the 6-month mortality outcome and Fig. 14C for the 30-day readmission outcome.
[0238] Label definition. The definition of the labels for each of the outcomes was as described in Example 1, section 1.1., subsection “Label definition”.
[0239] Data splitting. The prevalence of patients with each of the three different post-treatment complication endpoints in the data set was: 8% for in-hospital mortality, 24% for 6-month mortality, and 19% for 30-day readmission. Before training each of the distinct ML models for each of the defined endpoints, the data was split into a training set and a tuning set with an 80 / 20 ratio (Fig. 15). To maintain the prevalence of the positive and negative labels for each of the endpoints in both the training and the tuning sets, the patients were stratified during the split (Tables 2-4, as also shown above in Example 1). Each of the distinct ML models was trained using a training set corresponding to 80% of the data and evaluated with a tuning set corresponding to 20% of the data.
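A stratified 80 / 20 split of the kind described above can be sketched with scikit-learn's `train_test_split`. The synthetic labels (8% positives among 1000 patients), the placeholder feature matrix and the random seed are assumptions made for illustration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic labels (assumption): 1000 patients with 8% positive outcomes.
y = np.array([1] * 80 + [0] * 920)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder feature matrix

# stratify=y keeps the positive/negative proportions equal in both parts.
X_train, X_tune, y_train, y_tune = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0
)

print(len(y_train), len(y_tune))
print(f"train prevalence: {y_train.mean():.3f}, tune prevalence: {y_tune.mean():.3f}")
```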
[0240] The training set further underwent repeated k-fold cross-validation (CV). The k-fold CV is used for model evaluation using AUC: the training dataset is divided into k equally sized folds, and the model is trained iteratively on k-1 of the folds and tested on the remaining fold of the dataset. Here, "k" represents the number of equally sized subsets / folds of the dataset. In this example, for the feature ranking work the inventors used 10-fold nested cross validation (i.e. k=10) that was repeated 10 times with different random splits and model seeds to create a sample size of 100. For the model training the inventors used 10-fold nested cross validation (k=10). AUC (Area Under the Receiver Operating Characteristic Curve, sometimes referred to as AUC-ROC) is a scalar value representing the overall performance of the classifier and reflects the model's ability to discriminate between positive and negative classes. The AUC value lies between 0.0 and 1.0, where a value of 0.5 corresponds to random guessing, a value less than 0.5 indicates that the model performs worse than random guessing, and 1.0 is an ideal AUC value representing a perfect model that correctly classifies all positive and negative instances without any errors.
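The interpretation of AUC given above can be made concrete with a small sketch: AUC equals the fraction of (positive, negative) patient pairs in which the positive patient receives the higher score, with ties counted as one half. The function name `auc_pairwise` and the toy inputs are illustrative assumptions; library implementations such as scikit-learn's `roc_auc_score` compute the same quantity more efficiently.

```python
def auc_pairwise(y_true, y_score):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): three of the four pairs are ranked correctly.
toy_auc = auc_pairwise([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)
```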
[0241] Handling of missing data. The handling of missing values is done the same as in Example 1 and described in detail above in Example 1 section 1.1., subsection “Handling of missing data”.
[0242] 2.1.2. Feature selection process
[0243] In the present example, as was done in Example 1, the features were selected based on their predictive power for each of the endpoints separately: in-hospital mortality after treatment of STEMI, 6-month mortality after treatment of STEMI, and the risk of 30-day readmission at hospital after discharge of the patient. Feature selection was performed in two stages: (A) initial model training for each model separately using 10-fold nested cross-validation and SHAP (SHapley Additive exPlanations) value based feature ranking (Fig. 16, panel A), and (B) iterative feature selection with stop criteria to identify the most relevant features (Fig. 16, panel B). Additionally, clinical experts were consulted on the selected feature list to ensure that the features are sensible and are also in line with clinical practice of data collection. The feature selection method is outlined in more detail below.
[0244] A. Initial Model Training and SHAP value based Feature Ranking
[0245] The first stage of the feature selection process comprises the initial model training and SHAP value based feature ranking. Four different model types were used: CatBoost, XGBoost, RandomForest and LogisticRegression. The two stages, A and B, of the feature selection were applied separately for each of the model types. The SHAP values represent how much each feature contributes to the predicted value of the target, taking into account all other features in the same instance.
[0246] The first stage of the feature selection process included the following steps: (i) training the model (each of the model types separately: CatBoost, XGBoost, RandomForest or LogisticRegression) using repeated 10-fold nested cross-validation on all of the features; (ii) computation of SHAP values for all of the features using the SHAP package; (iii) ranking the (X) features based on the average absolute SHAP values.
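Step (iii) above, ranking by average absolute SHAP value, can be sketched as follows. The sketch assumes that a matrix of SHAP values (rows are patients, columns are features) has already been computed, for example with the SHAP package; the function name `rank_by_mean_abs_shap`, the feature names and the numeric values are hypothetical and serve only to show the ranking step.

```python
import numpy as np


def rank_by_mean_abs_shap(shap_values, feature_names):
    """Return (feature, mean |SHAP|) pairs sorted from most to least influential."""
    importance = np.abs(shap_values).mean(axis=0)  # average over patients
    order = np.argsort(importance)[::-1]           # descending importance
    return [(feature_names[i], float(importance[i])) for i in order]


# Toy SHAP matrix (hypothetical values): 3 patients x 3 features.
shap_values = np.array([
    [0.30, -0.05, 0.10],
    [-0.20, 0.02, 0.15],
    [0.25, -0.01, -0.20],
])
names = ["age", "sodium", "albumin"]
ranking = rank_by_mean_abs_shap(shap_values, names)
for name, value in ranking:
    print(f"{name}: {value:.4f}")
```

Taking the absolute value before averaging matters because a feature may push predictions up for some patients and down for others; a plain mean would let these contributions cancel.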
[0247] A repeated 10-fold nested cross-validation consists of two levels of cross-validation, an outer loop and an inner loop, and the entire process is repeated 10 times with different random splits and model seeds. The outer loop is used for model evaluation based on the highest AUC value: the training dataset is divided into 10 equally sized folds, and the model is trained iteratively on 9 of the folds and tested on the remaining fold of the dataset. The inner loop is used for hyperparameter tuning: with each iteration of the outer loop a nested (inner) cross-validation is performed on the subset of 9 folds of the training data, wherein for each combination of hyperparameters the data is split into 10 folds and the model is trained on 9 folds and tested on the remaining fold. The best-performing hyperparameters are then used to train the model on the entire training set of the outer loop.
[0248] B. Iterative Feature Selection with stop criteria
[0249] The second stage of the feature selection process comprised determining the optimal number of features for each model by evaluating the model's performance (AUC value). The second stage of the feature selection process includes the following steps:
[0250] (i) retrieving the (X) features ranked based on the average absolute SHAP values of the first stage described above; iteratively repeating the following steps (ii)-(iv), starting from the top feature (K=1) of the ranked list: (ii) training the model on the feature set ([K]), comprising the (K) top features of the (X) features, using the training set and 10-fold nested cross-validation, repeated 10 times with different random starting seeds for the model and the data splits; (iii) storing the mean AUC; (iv) adding a feature (K+1) to the feature set ([K, K+1]), comprising the (K+1) top features of the (X) features; the steps (ii)-(iv) are repeated up to the last iteration at which the improvement in the AUC over the previous step exceeds a threshold, r (r is chosen to be the smallest effect size that can be sufficiently differentiated from noise whilst being lower than what is clinically meaningful, in other words, the smallest effect size (r) that can be detected with a sample size (N) of 100 and power (P) > 80% while using the Wilcoxon signed rank test), or until all (X) features are included in the feature set (Fig. 17); and
[0251] (v) retrieving the top final features for the final model.
[0252] The trained LR model for each of the endpoints was chosen as the superior model because it offers greater simplicity and performs equally well with fewer predictive features (discussed in more detail in the Results section of Example 2). The selection of the final feature list for an LR model will be described in more detail below. Fig. 17 illustrates the AUC curve for forward feature selection for each of the endpoints. The final feature sets for the trained LR models for each of the endpoints, determined using the method described herein, are listed below (Tables 22-24). The SHAP values of the final feature set are shown in Fig. 18.
[0253] Fig. 17 shows results of the forward feature selection for prediction of each of the endpoints (Fig. 17A - in-hospital mortality, Fig. 17B - 6-month mortality, Fig. 17C - 30-day readmission) described in Example 2. The plot shows the mean AUC over a plurality of cross-validation folds, the 95% confidence interval around the mean and analytical overlays (Last Maximum AUC Mean, p-value curve, set of Power Curves, Plateau Point). In Figs. 17A-C, the "Last Maximum AUC Mean" curve tracks the highest mean AUC obtained in any preceding iteration of the forward feature selection, acting as a high-water mark. This benchmark value is compared against the mean AUC of the current feature set; if the current mean AUC is higher, it replaces the last maximum AUC mean value and is used for comparison in the next iteration. The "p-value curve" indicates the calculated p-value for each iteration. This p-value is derived using the Wilcoxon signed-rank test from the statistical difference in AUC means between samples (comparing the current iteration's AUC to the previous one's). The calculated p-value is then compared to a pre-defined significance threshold (here alpha = 0.05) to determine if an observed increase in mean AUC is statistically significant. The set of “Power curves” is used as a verification step, wherein the statistical test's power is confirmed to be above a reliability threshold (here >80%) for detecting a minimum meaningful effect (in the plot indicated as three separate power curves corresponding to minimum meaningful effect values: 0.005, 0.003, 0.002 AUC), thereby validating the trustworthiness of the statistical stopping criteria. The “Plateau Point” is a marker within the feature selection process that identifies the optimal feature set.
It is defined as the iteration corresponding to the last performance gain greater than 0.003 AUC, as any improvements below this minimum meaningful threshold are considered statistically indistinguishable from random noise.
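The stopping logic described above can be sketched as follows. This is an illustrative reading of the criteria rather than the implementation used in the study: it assumes that the per-iteration AUC samples (one array of paired AUC values per feature count) are already available, that each iteration is compared against the iteration holding the last maximum mean AUC, and that a gain counts only if it exceeds the minimum meaningful effect and is significant under a one-sided Wilcoxon signed-rank test. The function `plateau_point` and the synthetic AUC samples are hypothetical.

```python
import numpy as np
from scipy.stats import wilcoxon


def plateau_point(auc_samples, min_gain=0.003, alpha=0.05):
    """Return the 1-based feature count of the last meaningful, significant AUC gain.

    auc_samples[k] holds the paired AUC values (one per CV repeat and fold)
    obtained with the top k+1 features.
    """
    best_idx = 0  # iteration holding the last maximum mean AUC
    plateau = 1
    for k in range(1, len(auc_samples)):
        current, reference = auc_samples[k], auc_samples[best_idx]
        gain = current.mean() - reference.mean()
        if gain > 0:
            p_value = wilcoxon(current, reference, alternative="greater").pvalue
            if gain > min_gain and p_value < alpha:
                plateau = k + 1
            best_idx = k  # new high-water mark
    return plateau


# Synthetic AUC samples (assumption): 100 values per iteration, gains flatten after 3 features.
rng = np.random.default_rng(0)
mean_auc = [0.700, 0.800, 0.850, 0.851, 0.8515]
auc_samples = [m + rng.normal(0.0, 0.002, size=100) for m in mean_auc]

n_features = plateau_point(auc_samples)
print(f"plateau reached at {n_features} features")
```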
[0254] Fig. 17A illustrates the AUC curve generated during the iterative feature selection process for the prediction of in-hospital mortality; as the improvement in AUC no longer exceeded the significance threshold (r) beyond the 7th feature, the inventors selected the top 7 features (Table 22), thereby balancing model effectiveness and efficiency. A trained LR model comprising the first 3 features (a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, and a feature indicative of whether the patient has suffered cardiogenic shock; and one or more demographic features comprising a feature indicative of the patient’s age) already achieved a higher AUC (AUC=0.875023) than the benchmark clinical score GRACE 2.0 (IH) (AUC=0.855) (Fig. 17A).
[0255] Fig. 17B illustrates the AUC curve generated during the iterative feature selection process for the prediction of 6-month mortality; as the improvement in AUC no longer exceeded the significance threshold (r) beyond the 8th feature, the inventors selected the top 8 features (Table 23), balancing the model's effectiveness and efficiency. A trained LR model comprising the first 3 features (a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, and a feature indicative of whether the patient has suffered cardiogenic shock; and one or more demographic features comprising a feature indicative of the patient’s age) already achieved a higher AUC (AUC=0.858104) than the benchmark clinical score GRACE 2.0 (6M) (AUC=0.836) (Fig. 17B).
[0256] Fig. 17C illustrates the AUC curve generated during the iterative feature selection process for the prediction of 30-day readmission; as the improvement in AUC no longer exceeded the significance threshold (r) beyond the 7th feature, the inventors selected the top 7 features (Table 24), balancing the model's effectiveness and efficiency. A trained LR model comprising the first 6 features (a plurality of hospital admission history features comprising: a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting (6 months), and a feature indicative of admission duration; one or more laboratory features comprising a feature indicative of the patient’s blood urea nitrogen (BUN) level; a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered from kidney disease (renal disease), a feature indicative of whether the patient has suffered chronic pulmonary disease, and a feature indicative of whether the patient has suffered cardiogenic shock; and one or more demographic features comprising a feature indicative of the patient’s age) already achieved a higher AUC (AUC=0.641626) than the benchmark clinical score LACE (AUC=0.635) (Fig. 17C).
Table 22. List of the final feature set (7 features) for training a ML (LR) model for predicting risk of in-hospital mortality after treatment of STEMI.
[0257] Table 23. List of the final feature set (8 features) for training a ML (LR) model for predicting risk of 6-month mortality after treatment of STEMI.
[0258] Table 24. List of the final feature set (7 features) for training a ML (LR) model for predicting risk of 30-day readmission at hospital after discharge of the patient.
2.1.3. Training the model
[0259] A set of distinct ML models was trained to predict the risk of in-hospital mortality after treatment of STEMI, 6-month mortality after treatment of STEMI, and the risk of 30-day readmission at hospital after discharge of the patient. In the present example, for each of the endpoints the inventors trained the following models: a plurality of decision tree based models, specifically a CatBoost model, an XGBoost model and a RF model; and a regression model, specifically a logistic regression (LR) model. All model training and testing was implemented in the Python programming language using the following libraries: the Scikit-learn open-source software library was used for retrieving trainable models, and the CatBoost and extreme Gradient Boosting (XGBoost) open-source software libraries were used for providing gradient boosting trainable models. The XGBoost models were trained with the objective function binary:logistic (logistic regression for binary classification, output probability). For the XGBoost models the scale_pos_weight parameter (which controls the balance of positive and negative weights, and is useful for unbalanced classes) was set to 11.866151866151867 for prediction of in-hospital mortality, 3.2147822905232344 for prediction of 6-month mortality, and 4.277541553923464 for prediction of 30-day readmission (corresponding to sum(negative instances) / sum(positive instances)). The optimized hyperparameters of the trained ML models, XGBoost model, CatBoost model, RF model and LR model, to predict the risk of in-hospital mortality after treatment of STEMI are shown in Table 29. The optimized hyperparameters of the trained ML models, XGBoost model, CatBoost model, RF model and LR model, to predict the risk of 6-month mortality after treatment of STEMI are shown in Table 32.
The optimized hyperparameters of the trained ML models, XGBoost model, CatBoost model, RF model and LR model, to predict the risk of 30-day readmission after treatment of STEMI are shown in Table 35.
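The class-weight ratio described above, sum(negative instances) / sum(positive instances), can be sketched in a few lines. The helper names `scale_pos_weight` and `implied_prevalence` and the toy label vector are assumptions for illustration; the three ratios are the values reported in the text.

```python
def scale_pos_weight(labels):
    """Ratio of negative to positive instances in a list of 0/1 labels."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0:
        raise ValueError("no positive instances")
    return negatives / positives


def implied_prevalence(weight):
    """Positive-class prevalence implied by a negatives/positives ratio."""
    return 1.0 / (1.0 + weight)


# Toy labels (assumption): 2 positives among 10 patients, giving 8 / 2.
toy_weight = scale_pos_weight([1, 0, 0, 0, 0, 1, 0, 0, 0, 0])

# Ratios reported in the text above.
reported = {
    "in-hospital mortality": 11.866151866151867,
    "6-month mortality": 3.2147822905232344,
    "30-day readmission": 4.277541553923464,
}
prevalence = {name: implied_prevalence(w) for name, w in reported.items()}
for name, value in prevalence.items():
    print(f"{name}: {value:.1%}")
```

Under this relation the reported ratios imply prevalences of roughly 8%, 24% and 19%, which is consistent with the endpoint prevalences stated for the dataset.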
[0260] As mentioned above, each of the models was trained using a training set corresponding to 80% of the data and evaluated with a tuning set corresponding to the remaining 20% of the data, wherein the data is stratified to retain the prevalence of positive and negative labels. To evaluate and compare the different types of models the training set further underwent a 10-fold nested cross-validation (as described above), wherein the data was also stratified to retain the prevalence of positive and negative labels across the folds. The best-performing hyperparameters were then used to train the model on the entire training set of the outer loop. For hyperparameter tuning within the inner loop, random search cross-validation (CV) was used to efficiently explore the hyperparameter space for each of the machine learning models. A random search randomly samples a defined number of combinations from the specified hyperparameter distributions. Alternatively, grid search can be used for hyperparameter tuning.
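A stratified 10-fold nested cross-validation with random search in the inner loop can be sketched with scikit-learn as follows. The synthetic dataset, the hyperparameter search space for `C`, the number of sampled combinations (`n_iter=3`) and the random seeds are assumptions chosen to keep the example small; they are not the settings used in the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold, cross_val_score

# Synthetic, imbalanced data (assumption); not the study cohort.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, n_clusters_per_class=1,
    weights=[0.8, 0.2], random_state=0,
)

# Stratified folds retain the label prevalence in every fold.
inner_cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
outer_cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=2)

# Inner loop: random search samples n_iter candidates from the search space
# and, with the default refit=True, retrains the best one on the outer training folds.
search = RandomizedSearchCV(
    estimator=LogisticRegression(max_iter=1000),
    param_distributions={"C": [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]},  # hypothetical space
    n_iter=3,
    scoring="roc_auc",
    cv=inner_cv,
    random_state=0,
)

# Outer loop: each fold runs the search on 9 folds and scores the held-out fold.
outer_auc = cross_val_score(search, X, y, scoring="roc_auc", cv=outer_cv)
mean_auc = float(np.mean(outer_auc))
print(f"outer-loop AUC: {mean_auc:.3f} +/- {np.std(outer_auc):.3f}")
```

Replacing `RandomizedSearchCV` with `GridSearchCV` (and `param_distributions` with `param_grid`) gives the grid search alternative mentioned above.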
[0261] The performance of the different models for each endpoint separately was evaluated on the training data, and compared to the benchmark standard of care clinical scores, in a similar way as described in Example 1. For direct comparison with the trained models, the performance (AUC) of the benchmark clinical scores (GRACE 2.0 (IH), GRACE 2.0 (6M) and LACE index, respectively for the in-hospital mortality, 6-month mortality and 30-day readmission models) was determined using the same training dataset (Premier Healthcare Solutions). First, the performance of the models was evaluated by comparing their Area Under the Receiver Operating Characteristic Curve (AUC, sometimes referred to as AUC-ROC) on the training data. In the present example, Logistic Regression models were chosen for each of the endpoints for further evaluation. The LR models for each of the endpoints were trained using a 10-fold nested cross-validation (k=10). Subsequently the models were compared to the benchmark standard of care clinical scores.
[0262] Each of the ML models was calibrated and threshold-adjusted for comparison with the standard of care clinical scores on the threshold-adjusted cut-offs. Adjustments to the training process to account for class imbalance systematically miscalibrated the raw probability outputs, necessitating a post-processing calibration step. The calibration was done using Platt scaling. Platt scaling is a post-processing technique used to convert the raw output scores of a binary classification model into well-calibrated probability estimates. This method is particularly valuable when the outputs of the classifier need to reflect a true likelihood of an instance belonging to a particular class, making it suitable for applications requiring probabilistic decision-making and risk assessment. Platt scaling achieves this by fitting a logistic regression model to the classifier's output scores, using a sigmoid function to map these scores to the [0, 1] probability interval. The threshold adjustment was done by reproducing the resulting proportions from the standard of care clinical scores GRACE 2.0 and LACE for low, medium and high risk to the trained distinct ML models. The performance of the trained models to benchmark score was compared based on the following metrics: AUC, sensitivity (sens), negative predictive value (npv), specificity (spec), positive predictive value (ppv). Table 29. Optimized hyperparameters and the predictive features of different types of trained ML models to predict risk of in-hospital mortality after treatment of STEM!.
[0263] Table 32. Optimized hyperparameters and the predictive features of different types of trained ML models to predict risk of 6-month mortality after treatment of STEMI.
[0264] Table 35. Optimized hyperparameters and the predictive features of different types of trained ML models to predict risk of 30-day readmission after treatment of STEMI.
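The Platt scaling step described in paragraph [0262] can be illustrated with a minimal sketch. It is not the implementation used in the example: the synthetic data, the function names `fit_platt` and `apply_platt`, and the choice of a plain Newton solver are assumptions made for illustration, and the target smoothing from Platt's original formulation is omitted.

```python
import numpy as np

def fit_platt(scores, labels, n_iter=100):
    """Fit p = sigmoid(a * score + b) by Newton's method on the log-loss."""
    s = np.asarray(scores, dtype=float)
    y = np.asarray(labels, dtype=float)
    X = np.column_stack([s, np.ones_like(s)])  # columns: raw score, intercept
    w = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        grad = X.T @ (p - y)
        # Tiny ridge term keeps the Hessian invertible; it does not move the optimum.
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + 1e-8 * np.eye(2)
        step = np.linalg.solve(hess, grad)
        w -= step
        if np.max(np.abs(step)) < 1e-10:
            break
    return w[0], w[1]

def apply_platt(scores, a, b):
    """Map raw scores to calibrated probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(a * np.asarray(scores, dtype=float) + b)))

# Synthetic, hypothetical data: raw scores whose log-odds are shifted upwards,
# mimicking the over-prediction that class-imbalance adjustments can cause.
rng = np.random.default_rng(0)
true_p = rng.uniform(0.01, 0.3, size=5000)
labels = rng.binomial(1, true_p)
raw_scores = np.log(true_p / (1.0 - true_p)) + 2.0

a, b = fit_platt(raw_scores, labels)
calibrated = apply_platt(raw_scores, a, b)
uncalibrated = apply_platt(raw_scores, 1.0, 0.0)
```

In this sketch the mean of `calibrated` matches the observed event rate, whereas the mean of `uncalibrated` is far above it. In practice the sigmoid would be fitted on data held out from model training.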
[0265] 2.1.4. Determining the thresholds
[0266] The ML classifiers in the present example were used to classify patients into three groups of low, medium and high risk of post-treatment complications. To classify the subjects into the three categories, two cutoff values were determined separately for each of the endpoints. A first cutoff value (also referred to as “rule-out”) separates patients at low risk of the complication from other patients. A second cutoff value (also referred to as “rule-in”) separates patients at high risk of the complication from other patients. Patients with a predicted probability above the first cutoff value (“rule-out”) and below the second cutoff value (“rule-in”) were classified as being at medium risk of complications. In the present example, the thresholds were derived to replicate the rule-out sensitivity and rule-in specificity reported for the established benchmarks in the scientific literature. This was done according to the following steps: (i) identify the metrics, namely sensitivity for rule-out and specificity for rule-in, from the relevant scientific literature pertaining to the benchmark scores for each clinical endpoint (Tables 25-27, 28); (ii) determine the cutoff values (rule-out and rule-in) of the model trained on the training data that replicate the literature-derived benchmark sensitivity and specificity on the tuning data. The rule-out sensitivity and rule-in specificity thresholds (Table 28) were derived by first calculating the median sensitivity and specificity values from the relevant benchmark studies (Tables 25-27) and then rounding these to more practicable values.
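The two-step threshold derivation described above can be sketched as follows. This is an illustrative sketch only: the function names, the synthetic tuning data, the handling of patients exactly at a cutoff, and the target values of 0.95 and 0.70 used in the demonstration are assumptions, not the values of Table 28.

```python
import numpy as np

def rule_out_cutoff(probs, labels, target_sensitivity):
    """Largest cutoff at which at least target_sensitivity of events score >= cutoff."""
    pos = np.sort(np.asarray(probs)[np.asarray(labels) == 1])
    # At most floor((1 - target) * n_pos) events may fall below the cutoff.
    n_missed = int(np.floor((1.0 - target_sensitivity) * len(pos) + 1e-9))
    return pos[n_missed]

def rule_in_cutoff(probs, labels, target_specificity):
    """Smallest cutoff at which at least target_specificity of non-events score < cutoff."""
    neg = np.sort(np.asarray(probs)[np.asarray(labels) == 0])
    n_needed = int(np.ceil(target_specificity * len(neg) - 1e-9))
    # The cutoff must lie just above the n_needed-th smallest non-event score.
    return np.nextafter(neg[n_needed - 1], np.inf)

def stratify(probs, rule_out, rule_in):
    """Assign low / medium / high risk (boundary handling is an assumption)."""
    probs = np.asarray(probs)
    return np.where(probs < rule_out, "low",
                    np.where(probs >= rule_in, "high", "medium"))

# Hypothetical tuning data with roughly 10% event prevalence.
rng = np.random.default_rng(1)
labels = rng.binomial(1, 0.1, size=4000)
scores = rng.normal(loc=labels * 1.0, scale=1.0)
probs = 1.0 / (1.0 + np.exp(-(scores - 2.0)))

rule_out = rule_out_cutoff(probs, labels, target_sensitivity=0.95)
rule_in = rule_in_cutoff(probs, labels, target_specificity=0.70)
groups = stratify(probs, rule_out, rule_in)
```

The sketch assumes the rule-out cutoff lies below the rule-in cutoff, as is the case for the thresholds reported in this example; if the two targets produced crossing cutoffs, the medium-risk group would be empty.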
[0267] Table 25. Sensitivity and specificity reported for GRACE 2.0 (IH) for in-hospital mortality.
Table 26. Sensitivity and specificity reported for GRACE 2.0 (6M) for 6-month mortality.
[0268] Table 28. Summary of specified rule-out sensitivity and rule-in specificity thresholds.
[0269] 2.1.5. Exploratory analyses
[0270] Threshold based metrics. The performance of the ML model trained to predict risk for post-treatment complications of STEMI patients, specifically in-hospital mortality, 6-month mortality and 30-day readmission risk (as described in the preceding sections), was evaluated using the following metrics, wherein the calculation of the metrics is as described above in Example 1:
(i) Area Under the Curve (AUC) (overall performance): A global measure of a model's ability to distinguish between positive and negative cases across all possible decision thresholds. An AUC of 1.0 represents a perfect model, while an AUC of 0.5 represents a model with no better-than-random predictive ability.
(ii) Sensitivity (rule-out): The proportion of actual positive cases correctly identified by the model. High sensitivity is critical for "rule-out" scenarios, as it ensures few cases are missed, rendering a negative result highly trustworthy.
(iii) Negative Predictive Value (NPV, npv): The probability that a patient with a negative test result is truly outcome-free (a true negative).
(iv) Specificity (rule-in): The proportion of actual negative cases correctly identified by the model. High specificity is utilized for "rule-in" scenarios, as it implies that a positive result is highly likely to be a true positive.
(v) Positive Predictive Value (PPV, precision) (rule-in): The proportion of positive predictions that represent actual positive cases.
(vi) Risk Ratio (rule-out and rule-in): A metric comparing the probability of being outcome-free if classified below the threshold with the probability of being outcome-free if classified above the threshold. A high value indicates that falling below the threshold strongly confirms the absence of the outcome. It is calculated as RR = NPV / (1 - PPV).
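The metrics listed above can be computed from a confusion matrix at a given cutoff. The following sketch uses a small hand-made dataset; the function names and the data are hypothetical and serve only to make the definitions concrete.

```python
import numpy as np

def auc(probs, labels):
    """Rank-based (Mann-Whitney) AUC with average ranks for tied scores."""
    p = np.asarray(probs, dtype=float)
    y = np.asarray(labels).astype(bool)
    _, inv, counts = np.unique(p, return_inverse=True, return_counts=True)
    avg_rank = np.cumsum(counts) - (counts - 1) / 2.0  # 1-based average rank
    ranks = avg_rank[inv]
    n_pos, n_neg = y.sum(), (~y).sum()
    return (ranks[y].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

def threshold_metrics(probs, labels, cutoff):
    """Sensitivity, specificity, PPV, NPV and risk ratio at one cutoff."""
    pred = np.asarray(probs) >= cutoff
    y = np.asarray(labels).astype(bool)
    tp = np.sum(pred & y)
    fp = np.sum(pred & ~y)
    tn = np.sum(~pred & ~y)
    fn = np.sum(~pred & y)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return {
        "sens": tp / (tp + fn),
        "spec": tn / (tn + fp),
        "ppv": ppv,
        "npv": npv,
        "risk_ratio": npv / (1.0 - ppv),  # RR = NPV / (1 - PPV)
    }

# Hypothetical toy data: 4 events and 4 non-events.
labels = np.array([0, 0, 1, 1, 0, 1, 0, 1])
probs = np.array([0.10, 0.20, 0.30, 0.80, 0.40, 0.70, 0.05, 0.90])

m = threshold_metrics(probs, labels, cutoff=0.35)
overall_auc = auc(probs, labels)
```

With these toy values, three of four events and three of four non-events are classified correctly at the cutoff of 0.35, so sensitivity, specificity, PPV and NPV all equal 0.75 and the risk ratio is 0.75 / 0.25 = 3.0; the AUC is 15/16.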
[0271] SHAP values. The relative importance of the predictive features for each trained model was quantified using SHAP (SHapley Additive exPlanations) values. Fig. 18 illustrates the mean absolute SHAP values for the final feature list of the LR model trained to predict an adverse event after treatment of STEMI.
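For a linear model such as logistic regression, SHAP values have a closed form when the features are treated as independent: the contribution of a feature to the log-odds is its coefficient multiplied by the deviation of the feature value from the background mean. The sketch below illustrates how mean absolute SHAP values can be obtained under that assumption. Whether the example used this explainer or this output scale is not stated, and the coefficients and data shown are hypothetical.

```python
import numpy as np

def linear_shap_values(X, coef, background_mean):
    """Per-patient SHAP values in log-odds space for a linear model,
    assuming independent features (an assumption of this sketch)."""
    return (np.asarray(X, dtype=float) - background_mean) * coef

# Hypothetical standardized-ish feature matrix and LR parameters.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) * np.array([1.0, 2.0, 0.5])
coef = np.array([0.8, -0.3, 0.0])
intercept = -2.0

background_mean = X.mean(axis=0)
phi = linear_shap_values(X, coef, background_mean)

# Global importance: mean absolute SHAP value per feature.
mean_abs_shap = np.abs(phi).mean(axis=0)

# Additivity: base value plus the SHAP values reproduces each patient's log-odds.
base_value = intercept + background_mean @ coef
```

A feature with a zero coefficient receives a mean absolute SHAP value of zero, and ranking features by `mean_abs_shap` gives the kind of ordering shown in a bar plot of feature importance.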
[0272] Calibration evaluation. Calibration of an ML model trained to predict risk for post-treatment complications of STEMI patients (as described in the preceding sections) was evaluated by comparing the Brier score of the disclosed ML model against that of its respective benchmark (GRACE 2.0 or LACE) applied to the same dataset. To assess statistical significance, a bootstrapping method (1000 iterations) was employed to generate a 95% confidence interval for the difference in Brier scores (the disclosed ML model minus the respective benchmark). The difference in Brier scores (“PREMIER-PERISCOPE” score minus benchmark score) was calculated for each bootstrap sample, and the distribution of these differences across the bootstrap samples was used to estimate the confidence interval. A negative value indicates a lower prediction error (better calibration) for the model of the present example.
[0273] Formal superiority testing focused on the "relevant range" between the pre-specified rule-out and rule-in thresholds, as this interval is critical for clinical decision-making. The calibration of the disclosed ML model is considered statistically superior if the 95% confidence interval for the difference in Brier scores is entirely negative (i.e., does not contain zero). Conversely, if the CI contains zero (i.e., the upper bound is above zero), the difference is not statistically significant. In the present example, the benchmark models exhibit high calibration accuracy; therefore, a confidence interval that includes zero indicates statistical non-inferiority, which is considered a positive validation outcome. Consequently, if a disclosed ML model achieves superior discrimination (higher AUC) without compromising the calibration reliability established by the standard of care, it represents an improvement over the benchmark.
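The bootstrap comparison of Brier scores described in paragraphs [0272] and [0273] can be sketched as below. The data are synthetic, the function names are hypothetical, and the optional `mask` argument is only one possible way to restrict the analysis to the relevant range between the rule-out and rule-in thresholds; the example does not specify how that restriction was implemented.

```python
import numpy as np

def brier(probs, labels):
    """Mean squared difference between predicted probability and outcome."""
    return np.mean((probs - labels) ** 2)

def bootstrap_brier_difference(model_probs, bench_probs, labels,
                               n_boot=1000, mask=None, seed=0):
    """95% percentile CI for Brier(model) - Brier(benchmark)."""
    model_probs = np.asarray(model_probs, dtype=float)
    bench_probs = np.asarray(bench_probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    if mask is not None:  # optional restriction, e.g. to the relevant range
        model_probs, bench_probs, labels = (
            model_probs[mask], bench_probs[mask], labels[mask])
    rng = np.random.default_rng(seed)
    n = len(labels)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        diffs[i] = (brier(model_probs[idx], labels[idx])
                    - brier(bench_probs[idx], labels[idx]))
    lower, upper = np.percentile(diffs, [2.5, 97.5])
    return lower, upper

# Synthetic data: a well-calibrated "model" and a "benchmark" that
# over-predicts risk by a constant 0.15 (both hypothetical).
rng = np.random.default_rng(42)
true_p = rng.uniform(0.02, 0.30, size=3000)
labels = rng.binomial(1, true_p)
model_probs = true_p
bench_probs = true_p + 0.15

lower, upper = bootstrap_brier_difference(model_probs, bench_probs, labels)
superior = upper < 0  # CI entirely negative
```

Under the criterion stated above, an interval that lies entirely below zero indicates superior calibration of the model, while an interval that contains zero is read as non-inferiority.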
[0274] Figs. 19A-C illustrate the evaluation of the prediction reliability of the LR models trained to predict an adverse event after treatment of STEMI as described in Example 2 (in the figure referred to as “PREMIER-PERISCOPE”). The “all” suffix used in the figure indicates that the analysis includes all patients rather than a specific subgroup. The top panel of each of Figs. 19A-C presents a calibration plot, wherein the predicted probabilities (x-axis) are plotted against the observed true probabilities (y-axis). The dashed diagonal line represents perfect calibration. A curve falling above it indicates underestimation of risk, while a curve falling below it indicates overestimation. The shaded regions correspond to 95% confidence intervals, representing statistical uncertainty. The dashed vertical lines mark specific clinical decision thresholds (rule-in and rule-out). The central panel of each of Figs. 19A-C illustrates the distribution of predicted probabilities: a higher density of data points indicates where predictions are more concentrated, and a lower density where they are sparser.
[0275] Synthetic feature ablation. A feature ablation analysis was conducted to assess the impact of missing features. For each combination of numeric features, missingness was set to 100% and values were imputed using the training data median. Fig. 20 shows a feature ablation plot, which quantifies the impact of missing data on model performance. The y-axis lists the specific feature combinations that are completely removed, while the x-axis measures the resulting change in the AUC. The (green) circles indicate the model's original baseline performance. The (black) triangles indicate the reduced performance estimates after the corresponding features are set to missing, wherein the 95% confidence intervals of these new scores are indicated with horizontal lines. Vertical dashed lines provide reference standards: the central (orange) line marks the performance of the current benchmark applied to the same data, and the (green) line on the left side marks the success criterion, which is based on the performance reported for the same benchmark in the literature.
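The ablation procedure described above (set a combination of numeric features to fully missing, impute the training median, re-evaluate the AUC) can be sketched as follows. The model coefficients, the synthetic data and the function names are hypothetical, and confidence intervals are omitted for brevity.

```python
import itertools
import numpy as np

def auc(probs, labels):
    """Rank-based AUC with average ranks for tied scores."""
    p = np.asarray(probs, dtype=float)
    y = np.asarray(labels).astype(bool)
    _, inv, counts = np.unique(p, return_inverse=True, return_counts=True)
    ranks = (np.cumsum(counts) - (counts - 1) / 2.0)[inv]
    n_pos, n_neg = y.sum(), (~y).sum()
    return (ranks[y].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

def feature_ablation(predict, X, y, numeric_cols, train_medians, max_missing):
    """AUC per combination of ablated columns; key () is the baseline."""
    results = {(): auc(predict(X), y)}
    for k in range(1, max_missing + 1):
        for combo in itertools.combinations(numeric_cols, k):
            X_abl = X.copy()
            for col in combo:
                # 100% missingness followed by median imputation.
                X_abl[:, col] = train_medians[col]
            results[combo] = auc(predict(X_abl), y)
    return results

# Hypothetical LR model with three numeric features.
coef = np.array([1.5, 1.0, 0.5])
intercept = -2.0

def predict(X):
    return 1.0 / (1.0 + np.exp(-(X @ coef + intercept)))

rng = np.random.default_rng(7)
X_train = rng.normal(size=(3000, 3))
X_eval = rng.normal(size=(3000, 3))
y_eval = rng.binomial(1, predict(X_eval))
train_medians = np.median(X_train, axis=0)

results = feature_ablation(predict, X_eval, y_eval, [0, 1, 2],
                           train_medians, max_missing=2)
```

Each entry of `results` corresponds to one row of a feature ablation plot, and each value can be compared against the benchmark AUC and the success criterion.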
[0276] Subgroup missingness. To test the effect of the quantity of missing data, a subgroup analysis was performed by the inventors. Scenarios with one and two missing numeric features were simulated. The analysis measures how the model's AUC degrades as the number of missing features increases. The model's AUC was compared to the benchmarks on the same data to determine whether it remained above the primary success criteria.
[0277] 2.2. Results
[0278] The following section summarizes the evaluation results of the ML models trained for each endpoint (in-hospital mortality after treatment of STEMI, 6-month mortality after treatment of STEMI, and the risk of 30-day readmission at hospital after discharge of the patient). The results are presented in the following sequence: (I) comparison of the calibrated and threshold-adjusted model performance, specifically AUC, of the different types of ML models with the benchmark standard of care clinical score, specifically GRACE 2.0 or LACE index (Tables 30, 33, 36); (II) detailed discussion of the LR models for each of the endpoints, wherein (i) the optimized hyperparameters and the SHAP values of the final feature list (Fig. 18) are shown, (ii) a comparison between the LR models for each of the endpoints and the benchmark standard of care clinical score, specifically GRACE 2.0 or LACE index, is provided, (iii) the model evaluation based on Brier score (Fig. 19) is shown, and (iv) the feature ablation analysis (impact of feature missingness) (Fig. 20) is shown. For direct comparison with the trained models, the performance (AUC) of the benchmark clinical scores, GRACE 2.0 and LACE index, was determined using the same training dataset (Premier Health Solutions). The different training datasets in Examples 1 and 2 result in different AUC values for the benchmark scores.
[0279] 2.2.1. Evaluation of trained ML models for predicting risk of in-hospital mortality after treatment of STEMI
The predictive features of the different types of trained ML models are shown in Table 29. The comparison of the model performance (in terms of AUC value) of the different types of trained ML models with the benchmark standard of care clinical score GRACE 2.0 (IH) is shown in Table 30. The gradient boosting algorithms (XGBoost (AUC = 0.910) and CatBoost (AUC = 0.910)), the Random Forest model (AUC = 0.908) and the Logistic Regression classifier (AUC = 0.900) performed similarly well and achieved AUC values noticeably higher than the benchmark score GRACE 2.0 (AUC = 0.74). The LR model was chosen as the superior model because it offers greater simplicity, interpretability and cost-effectiveness (it can be implemented with a lightweight formula), and achieved a high AUC with three fewer features than Random Forest and two fewer than CatBoost or XGBoost.
[0280] Table 30. Comparison of the model performance (in terms of AUC value and its 95% confidence interval (CI)) of the different types of ML models trained to predict in-hospital mortality after treatment of STEMI.
[0281] The SHAP values of the final feature list of the trained LR model for predicting the risk of in-hospital mortality after treatment of STEMI are shown in Fig. 18A. The SHAP values (mean absolute across all instances) indicate that the most influential features for predicting in-hospital outcomes are the patient's age (mean SHAP value: 0.5469), cardiogenic shock (mean SHAP value: 0.5153) and cardiac arrest (mean SHAP value: 0.4116). Serum albumin (mean SHAP value: 0.2777), congestive heart failure (mean SHAP value: 0.1767), white blood cell count (mean SHAP value: 0.1460) and cerebrovascular disease (mean SHAP value: 0.1435) also contributed to the model's decision-making process, albeit at lower levels.
[0282] The comparison between the LR model trained to predict the risk of in-hospital mortality after treatment of STEMI (indicated in Table 31 as “Disclosure”) and the benchmark standard of care clinical score GRACE 2.0 (IH) is provided in Table 31. The in-hospital mortality model of the disclosure demonstrates superior performance over the GRACE 2.0 benchmark across key metrics. The model's rule-in risk ratio is significantly higher, with a lower bound of 20.350 surpassing the upper bound of GRACE 2.0 (18.120), and its Positive Predictive Value (PPV) and Negative Predictive Value (NPV) are statistically superior, indicating better predictive capabilities for both rule-in and rule-out scenarios. While GRACE 2.0 exhibits a higher rule-in specificity in this comparison, this is a function of the specific threshold tuning strategy employed rather than an intrinsic limitation of the new model; thresholds were set based on literature-derived targets for general cohorts (aiming for 95% sensitivity and 70% specificity) rather than being optimized solely for this dataset. The true measure of the model's discriminatory power is the Area Under the Curve (AUC), where the model of the disclosure significantly outperforms GRACE 2.0 (0.900 vs 0.855). This superior AUC indicates a better overall trade-off capability. Verification confirms that if the model's thresholds are adjusted to match the specificity of GRACE 2.0 (0.756), the model yields a superior PPV (0.233 vs 0.215), indicating that at a matched operating point the model of the disclosure offers greater diagnostic precision.
[0283] Table 31. The performance metrics for rule-in and rule-out of the LR model for predicting the risk of in-hospital mortality after treatment of STEMI and of the GRACE 2.0 (IH) score. CI = confidence interval.
[0284] The Brier score comparison between the LR model trained to predict the risk of in-hospital mortality after treatment of STEMI and the benchmark standard of care clinical score GRACE 2.0 (IH) is shown in Fig. 19A. The top panel of Fig. 19A presents a calibration plot, wherein the predicted probabilities (x-axis) are plotted against the observed true probabilities (y-axis). The dashed vertical lines, parallel to the y-axis on the left side of the panel, mark specific clinical decision thresholds: the rule-in threshold 0.039 and the rule-out threshold 0.032. For direct comparison with the trained models, the Brier score of the benchmark clinical score GRACE 2.0 (IH) was determined using the same training dataset (Premier Health Solutions). In this comparison, the LR classifier achieved a Brier score of 0.056, whereas the benchmark model GRACE 2.0 (IH) had a Brier score of 0.060, indicating similar performance (slightly better for the LR classifier). The bootstrap evaluation of the difference in Brier score within the relevant region, which is between the respective rule-out and rule-in thresholds, provided a 95% confidence interval estimate of (0.011, 0.024) for the LR model trained to predict in-hospital mortality, indicating that the difference between the models on this particular metric is not statistically significant. This is still considered a positive validation outcome, as the disclosed LR model trained to predict in-hospital mortality shows improvement over the benchmark GRACE 2.0 by delivering significantly better discrimination (improved AUC, Table 31) while maintaining a level of calibration reliability comparable to that of the benchmark.
[0285] To investigate the impact of feature missingness on the performance of the model, feature ablation analyses were conducted, replicating scenarios where certain features are missing or unobtainable. This analysis examined scenarios where certain numeric features were missing by setting their missingness to 100% and subsequently imputing the missing values with the median from the training data. The primary endpoint for evaluation was the Area Under the Curve (AUC). The results of the feature ablation analysis of the LR model trained to predict the risk of in-hospital mortality after treatment of STEMI are shown in Fig. 20A. For all feature ablation combinations, the AUC remained above the success criterion of 0.80. The ability of the disclosed model to maintain predictive accuracy despite missing data represents a significant advantage over the GRACE 2.0 score, which requires complete datasets to generate a risk score.
[0286] Additionally, subgroup missingness was investigated. This analysis examines how the model's performance, as measured by AUC, degrades as the quantity of missing features increases, regardless of which specific features are absent. The results of the feature subgroup ablation analysis of the LR model trained to predict the risk of in-hospital mortality after treatment of STEMI are shown in Fig. 20B. Since the 'age' feature was always available, scenarios with one or two missing features were evaluated. With one missing feature, the AUC decreased by approximately 0.01. With two missing features, the AUC decreased further by 0.01. However, in both scenarios, the AUC remained above the success criterion of 0.80.
[0287] 2.2.2. Evaluation of trained ML models for predicting risk of 6-month mortality after treatment of STEMI
[0288] The predictive features of the different types of trained ML models are shown in Table 32. The comparison of the model performance (in terms of AUC value) of the different types of trained ML models with the benchmark standard of care clinical score GRACE 2.0 (6M) is shown in Table 33. The gradient boosting algorithms (XGBoost (AUC = 0.904) and CatBoost (AUC = 0.905)) and the Logistic Regression classifier (AUC = 0.901) performed similarly well and achieved AUC values noticeably higher than the benchmark score GRACE 2.0 (AUC = 0.74). The Random Forest model (AUC = 0.897) also performed well and noticeably better than the benchmark score GRACE 2.0. The LR model was chosen as the superior model because it offers greater simplicity, interpretability and cost-effectiveness (it can be implemented with a lightweight formula), with performance similar to that of more complex models.
[0289] Table 33. Comparison of the model performance (in terms of AUC value and its 95% confidence interval (CI)) of the different types of ML models trained to predict 6-month mortality after treatment of STEMI.
[0291] The SHAP values of the final feature list of the LR model trained to predict the risk of 6-month mortality after treatment of STEMI are shown in Fig. 18B. The SHAP values (mean absolute across all instances) indicate that the most influential feature is cardiogenic shock (mean SHAP value: 0.7245). The features that are also powerful predictors are: cardiac arrest (mean SHAP value: 0.5889), age (mean SHAP value: 0.5597). Serum albumin (mean SHAP value: 0.2696), congestive heart failure (mean SHAP value: 0.2208), Blood Urea Nitrogen (BUN) (mean SHAP value: 0.2029), history of Cerebrovascular Disease (mean SHAP value: 0.1787) and elevated White Blood Cell Count (mean SHAP value: 0.1615) all also contributed to the risk prediction, albeit at a lower level.
[0292] The comparison between the LR model trained to predict the risk of 6-month mortality after treatment of STEMI (indicated in Table 34 as “Disclosure”) and the benchmark standard of care clinical score GRACE 2.0 (6M) is provided in Table 34. The 6-month mortality model of the disclosure clearly outperforms GRACE 2.0 (6M). Its rule-in risk ratio is significantly higher than that of the prior art, as its lower bound of 10.570 exceeds the upper bound of the GRACE 2.0 (6M) risk ratio (10.290). Both models show similar and high rule-out sensitivity. While GRACE 2.0 (6M) exhibits a slightly higher rule-in specificity in this comparison, this reflects a deliberate calibration choice based on literature-reported targets rather than an inherent model limitation. Specifically, the inventors chose the target sensitivity and specificity based on what is reported in the literature and not specifically for this dataset. The model can be tuned to have the same specificity as GRACE simply by using a different threshold, which would result in a better positive predictive value (PPV) at that specificity, since the model is inherently better at discriminating (as evidenced by the significantly higher AUC). This additionally means that a resource-constrained hospital implementing the proposed model can increase the threshold, which would result in ruling in fewer patients and achieving a better specificity than GRACE. The rule-out NPV and rule-in PPV are comparable, with the 6-month mortality model of the disclosure showing a slightly better NPV than the prior art. The 6-month mortality model of the disclosure has an AUC that is significantly better than that of the prior art model (GRACE 2.0).
[0293] Table 34. The performance metrics for rule-in and rule-out of LR model for predicting the risk of 6-month mortality after treatment of STEMI and GRACE 2.0 (6M) score.
[0294] The Brier score comparison between the LR model trained to predict the risk of 6-month mortality after treatment of STEMI and the benchmark standard of care clinical score GRACE 2.0 (6M) is shown in Fig. 19B. The top panel of Fig. 19B presents a calibration plot, wherein the predicted probabilities (x-axis) are plotted against the observed true probabilities (y-axis). The dashed vertical lines, parallel to the y-axis on the left side of the panel, mark specific clinical decision thresholds: the rule-in threshold 0.112 and the rule-out threshold 0.069. For direct comparison with the trained models, the Brier score of the benchmark clinical score GRACE 2.0 (6M) was determined using the same training dataset (Premier Health Solutions). In this comparison, the LR classifier achieved a slightly better Brier score of 0.10, compared to the benchmark model GRACE 2.0 (6M), which had a Brier score of 0.14. This suggests that both models are similarly calibrated given this data, with a marginal advantage for the Logistic Regression classifier. The bootstrapped evaluation of the difference in Brier score within the estimated relevant region, which is between the respective rule-out and rule-in thresholds, shows that the LR model trained to predict 6-month mortality is statistically superior (95% confidence interval on difference in Brier scores (-0.046, 0.005)).
[0295] To investigate the impact of feature missingness on the performance of the model, feature ablation analyses were conducted, simulating scenarios where certain features are missing or unobtainable. This analysis specifically examined scenarios where certain numeric features were missing by setting their missingness to 100% and subsequently imputing the missing values with the median from the training data. The primary endpoint for evaluation was AUC. The results of the feature ablation analysis for the LR model trained to predict the risk of 6-month mortality after treatment of STEMI are shown in Fig. 20C. The AUC remained above the success criterion of 0.70 with all possible feature ablation combinations. The disclosed model's ability to maintain predictive accuracy despite missing data shows a significant advantage over the GRACE 2.0 score, which requires complete datasets to generate a risk score.
[0296] Additionally, subgroup missingness was investigated. This analysis examines how the model performance, as measured by AUC, degrades as the quantity of missing features increases, regardless of which specific features are absent. The results of the feature subgroup ablation analysis for the LR model trained to predict the risk of 6-month mortality after treatment of STEMI are shown in Fig. 20D. Since the 'age' feature was always available, scenarios with one or two missing features were evaluated. With one missing feature, the AUC decreased by 0.01, and with two missing features, the AUC decreased further by 0.02. In both scenarios, the AUC remained above the success criterion of 0.70.
[0297] 2.2.3. Evaluation of trained ML models for predicting risk of 30-day readmission after treatment of STEMI
The predictive features of the different types of ML models trained to predict risk of 30-day readmission are shown in Table 35. The comparison of the model performance (in terms of AUC value) of the different types of trained ML models with the benchmark standard of care clinical score LACE is shown in Table 36. The gradient boosting algorithms (XGBoost (AUC = 0.664) and CatBoost (AUC = 0.668)), the Random Forest model (AUC = 0.650) and the Logistic Regression classifier (AUC = 0.665) all performed similarly well and achieved AUC values higher than the benchmark score LACE (AUC = 0.64). The LR model was chosen as the superior model because it offers greater simplicity and performs equally well with fewer predictive features.
[0298] Table 36. The comparison of the model performance (in terms of AUC value and its 95% confidence interval (CI)) of the different types of ML models trained to predict 30-day readmission after treatment of STEMI.
[0299] The SHAP values of the final feature list of the LR model trained to predict the risk of 30-day readmission after treatment of STEMI are shown in Fig. 18C. The SHAP analysis for the 30-day readmission model shows the most influential features are the number of ED visits in the last 6 months (mean SHAP value: 0.1350), Blood Urea Nitrogen (BUN) (mean SHAP value: 0.0930), congestive heart failure (mean SHAP value: 0.0875). Other important predictors include chronic pulmonary disease (mean SHAP value: 0.0810), hemoglobin (mean SHAP value: 0.0755). Renal disease (mean SHAP value: 0.0737) and admission duration (mean SHAP value: 0.0530) also contributed notably to the predictions, albeit at lower levels.
[0300] The comparison between the LR model trained to predict the risk of 30-day readmission after treatment of STEMI (indicated in Table 37 as “Disclosure”) and the benchmark standard of care clinical score LACE is provided in Table 37. The 30-day readmission model of the present disclosure has a higher AUC than the prior art model (LACE), and the rule-in risk ratio for the 30-day readmission model (2.320, 95% CI: 1.950 - 2.800) is higher than that of the LACE model (1.840, 95% CI: 1.580 - 2.150), although for both metrics the confidence intervals overlap substantially. LACE is statistically superior in rule-in specificity, with its lower bound of 0.900 exceeding the upper bound of the 30-day readmission model's specificity (0.810). The other metrics, including rule-out sensitivity, rule-out NPV and rule-in PPV, are not significantly different between the two models.
[0301] Table 37. The performance metrics for rule-in and rule-out of the LR model for predicting the risk of 30-day readmission after treatment of STEMI and of the LACE score.
The Brier score comparison between the Logistic Regression model trained to predict the risk of 30-day readmission at hospital after discharge of the patient and the benchmark standard of care clinical score LACE index is shown in Fig. 19C. The top panel of Fig. 19C presents a calibration plot, wherein the predicted probabilities (x-axis) are plotted against the observed true probabilities (y-axis). The dashed vertical lines, parallel to the y-axis on the left side of the panel, mark specific clinical decision thresholds: the rule-in threshold 0.213 and the rule-out threshold 0.137. For direct comparison with the trained models, the Brier score of the benchmark clinical score LACE was determined using the same training dataset (Premier Health Solutions). For the 30-day readmission endpoint, the Logistic Regression classifier achieved a Brier score of 0.14, while the benchmark model LACE index had a Brier score of 0.16. This suggests that both models are similarly calibrated given the training data, with the Logistic Regression classifier performing slightly better. The bootstrap evaluation of the difference in Brier score within the estimated relevant region, which is between the respective rule-out and rule-in thresholds, shows that the LR model trained to predict 30-day readmission is statistically superior (95% confidence interval on difference in Brier scores (-0.037, -0.010)).
[0302] To investigate the impact of feature missingness on the performance of the model, feature ablation analyses were conducted, simulating scenarios where certain features are missing or unobtainable. This analysis specifically examined scenarios where certain numeric features were missing by setting their missingness to 100% and subsequently imputing the missing values with the median from the training data. The primary endpoint for evaluation was AUC. The results of the feature ablation analysis for the LR model trained to predict the risk of 30-day readmission after treatment of STEMI are shown in Fig. 20E. The model’s baseline performance was superior to the required success criterion of 0.60 for all scenarios. The disclosed model's ability to maintain predictive accuracy despite missing data shows a significant advantage over the LACE score, which requires complete datasets to generate a risk score.
[0303] Subgroup missingness was also investigated. This analysis examines how the performance of the model, as measured by AUC, degrades as the quantity of missing features increases, regardless of which specific features are absent. The results of the feature subgroup ablation analysis for the LR model trained to predict the risk of 30-day readmission after treatment of STEMI are shown in Fig. 20F. In the example described herein, the features "number of ED visits last 6 months" and "admission duration" were derived via data aggregation, rendering them complete for all patients in the cohort. The evaluation of subgroup missingness was restricted to instances where data missingness was observed within the dataset. With one missing feature, the AUC decreased by 0.01, and with two missing features, the AUC decreased further by 0.02. In both scenarios, the AUC remained above the success criterion of 0.60.
[0304] 2.2.4. Primary endpoint
[0305] The primary endpoint for evaluating algorithm performance is AUROC. The objective of the inventors was to demonstrate that the trained ML models are statistically superior to the defined benchmark AUCs which have been derived from literature. Therefore, the lower bound of the 95% confidence interval should be higher than the pre-defined AUC. As can be seen in Table 38 below, the models described herein surpassed the success criteria in the tune set for all three outcomes.
[0306] In Table 38:
(i) Required Critical Value Z: A statistical multiplier (Z = 1.96 for a 95% confidence level) used in the Hanley & McNeil formula to scale the standard error, thereby determining the margin of error for the precision-based sample size calculation.
(ii) Actual Prevalence: The observed proportion of positive cases (instances where the predicted endpoint adverse event occurred) within the specific patient cohort used for the evaluation.
(iii) Actual Sample Size: The total number of patients included in the evaluation dataset.
(iv) Calculated Confidence Interval (D*2): The total width of the 95% confidence interval surrounding the AUC, calculated as twice the margin of error (D).
(v) Required Superiority Margin (SC): A pre-defined performance threshold added to the benchmark score to establish the minimum degree of improvement required to consider the new model successfully superior.
(vi) Actual AUC: The observed Area Under the Receiver Operating Characteristic Curve achieved by the trained model on the evaluation dataset.
(vii) 95% CI Lower Bound (AUC - D): The conservative estimate of the model's performance, calculated by subtracting the margin of error (D) from the Actual AUC.
(viii) AUC benchmark: The AUC performance score of the standard-of-care clinical tool (e.g., GRACE 2.0) used as the reference for comparison. This value is derived from relevant scientific literature and is different from the performance score obtained by applying the benchmark tool directly to the current validation dataset.
(ix) AUCbenchmark + SC - (AUCpoint - D): A comparison metric calculating the gap between the required success threshold (benchmark + superiority margin (SC)) and the model's conservative performance estimate (lower bound).
(x) Significantly_better: A binary outcome indicating whether the model's 95% CI Lower Bound is strictly greater than the sum of the AUC benchmark and the Required Superiority Margin.
[0307] Table 38. Evaluation of primary endpoint for trained LR models, on tune dataset.
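The quantities defined for Table 38 can be illustrated with a minimal sketch. It assumes the standard error is obtained from the published Hanley & McNeil (1982) approximation, and that the numbers of positive and negative cases are derived from the prevalence and the sample size. The function names, the dictionary keys, and the prevalence and sample size in the usage lines are hypothetical and are not taken from Table 38; only the AUC of 0.90 and the benchmark of 0.80 appear in this example.

```python
import math


def hanley_mcneil_se(auc: float, n_pos: int, n_neg: int) -> float:
    """Standard error of an AUC estimate (Hanley & McNeil, 1982 approximation)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def evaluate_primary_endpoint(auc, prevalence, sample_size, benchmark,
                              superiority_margin=0.0, z=1.96):
    """Compute the Table 38 style quantities for one endpoint (illustrative only)."""
    n_pos = round(prevalence * sample_size)   # cases where the adverse event occurred
    n_neg = sample_size - n_pos
    d = z * hanley_mcneil_se(auc, n_pos, n_neg)  # margin of error D
    lower_bound = auc - d                        # 95% CI lower bound (AUC - D)
    required = benchmark + superiority_margin    # AUC benchmark + SC
    return {
        "margin_of_error": d,
        "ci_width": 2.0 * d,                     # D*2
        "ci_lower_bound": lower_bound,
        "gap": required - lower_bound,           # AUCbenchmark + SC - (AUCpoint - D)
        "significantly_better": lower_bound > required,
    }


# Hypothetical cohort: prevalence and sample size are placeholders.
result = evaluate_primary_endpoint(auc=0.90, prevalence=0.05,
                                   sample_size=2000, benchmark=0.80)
print(result["significantly_better"], round(result["ci_lower_bound"], 3))
```

A negative value of the gap corresponds to a lower bound that clears the required threshold, which is the condition flagged by Significantly_better.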
[0308] 2.3. Conclusions
[0309] In conclusion, the trained ML models described in this example successfully demonstrated, for each of the endpoints, statistically significant performance improvements over traditional baseline models for predicting key post-STEMI outcomes. The AUCs achieved by the models exceeded the predefined benchmarks for in-hospital mortality (0.90 vs. 0.80), 6-month mortality (0.901 vs. 0.70), and 30-day readmission (0.665 vs. 0.60).
[0310] Further, all of the proposed new algorithms demonstrated exceptional robustness, maintaining performance above their success criteria under simulated data noise and missingness. The successful meeting of the primary AUC endpoint and this resilience collectively indicate the reliability of the new methods. The selection of Logistic Regression as the final model for all three endpoints, despite similar performance from more complex models, was based on its interpretability and ease of deployment. This choice represents a practical approach that maintains predictive power without unnecessary complexity.
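A robustness check of the kind summarized above can be sketched as follows. This paragraph does not specify how noise and missingness were simulated, so the sketch makes its own assumptions: Gaussian noise added to every feature, values dropped at random and replaced by the column median, and the worst AUROC over repeated perturbations compared with the success criterion. All function names and parameters are hypothetical.

```python
import random
import statistics


def auroc(labels, scores):
    """AUROC via the Mann-Whitney identity; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def perturb(rows, noise_sd, missing_rate, rng):
    """Return a noisy copy of rows; 'missing' values are imputed with the
    column median of the unperturbed data (an assumption of this sketch)."""
    n_cols = len(rows[0])
    medians = [statistics.median(r[j] for r in rows) for j in range(n_cols)]
    perturbed = []
    for row in rows:
        new_row = []
        for j, value in enumerate(row):
            if rng.random() < missing_rate:
                new_row.append(medians[j])
            else:
                new_row.append(value + rng.gauss(0.0, noise_sd))
        perturbed.append(new_row)
    return perturbed


def robustness_check(score_fn, rows, labels, success_threshold,
                     noise_sd, missing_rate, n_repeats=20, seed=0):
    """Re-evaluate a fixed scoring function on repeatedly perturbed inputs."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_repeats):
        noisy = perturb(rows, noise_sd, missing_rate, rng)
        aucs.append(auroc(labels, [score_fn(r) for r in noisy]))
    worst = min(aucs)
    return {"mean_auc": statistics.mean(aucs),
            "worst_auc": worst,
            "passes": worst >= success_threshold}
```

Here `score_fn` stands for any already trained model that maps a feature row to a risk score; the model itself is not retrained during the check.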
[0311] The outputs of the algorithms described in this example can be utilized in clinical practice. The data-driven risk stratification provided by these algorithms can assist clinicians in making decisions about patient care, resource allocation, and follow-up treatment. For example, the identification of patients at high risk for 30-day readmission could support the implementation of targeted interventions at discharge, such as personalized follow-up plans.
[0312] References
[0313] All references cited herein are incorporated herein by reference in their entirety and for all purposes to the same extent as if each individual publication or patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety.
[0314] Fox KA, et al. Should patients with acute coronary disease be stratified for management according to their risk? Derivation, external validation and outcomes using the updated GRACE risk score. BMJ Open. 2014 Feb 21;4(2):e004425. doi: 10.1136/bmjopen-2013-004425. PMID: 24561498; PMCID: PMC3931985.
van Walraven C, et al. Derivation and validation of an index to predict early death or unplanned readmission after discharge from hospital to the community. CMAJ. 2010 Apr 6;182(6):551-7. doi: 10.1503/cmaj.091117. Epub 2010 Mar 1. PMID: 20194559; PMCID: PMC2845681.
[0315] Fox KA, et al. Prediction of risk of death and myocardial infarction in the six months after presentation with acute coronary syndrome: prospective multinational observational study (GRACE). BMJ. 2006 Nov 25;333(7578):1091. doi: 10.1136/bmj.38985.646481.55.
Deng, L., Zhao, X., Su, X. et al. Machine learning to predict no reflow and in-hospital mortality in patients with ST-segment elevation myocardial infarction that underwent primary percutaneous coronary intervention. BMC Med Inform Decis Mak 22, 109 (2022). https://doi.org/10.1186/s12911-022-01853-2
[0316] Gupta S, et al. Evaluation of Machine Learning Algorithms for Predicting Readmission After Acute Myocardial Infarction Using Routinely Collected Clinical Data. Can J Cardiol. 2020 Jun;36(6):878-885. doi: 10.1016/j.cjca.2019.10.023. Epub 2019 Oct 25. PMID: 32204950.
[0317] Robert A Byrne, et al., 2023 ESC Guidelines for the management of acute coronary syndromes: Developed by the task force on the management of acute coronary syndromes of the European Society of Cardiology (ESC), European Heart Journal. Acute Cardiovascular Care, Volume 13, Issue 1, January 2024, Pages 55-161, https://doi.org/10.1093/ehjacc/zuad107
[0318] O'Gara PT, et al. 2013 ACCF/AHA guideline for the management of ST-elevation myocardial infarction: a report of the American College of Cardiology Foundation/American Heart Association Task Force on Practice Guidelines. Circulation. 2013 Jan 29;127(4):e362-425. doi: 10.1161/CIR.0b013e3182742cf6. Epub 2012 Dec 17. Erratum in: Circulation. 2013 Dec 24;128(25):e481. PMID: 23247304.
[0319] Ibanez B, et al. 2017 ESC Guidelines for the management of acute myocardial infarction in patients presenting with ST-segment elevation: The Task Force for the management of acute myocardial infarction in patients presenting with ST-segment elevation of the European Society of Cardiology (ESC). Eur Heart J. 2018 Jan 7;39(2):119-177. doi: 10.1093/eurheartj/ehx393. PMID: 28886621.
[0320] Singh, S., et al. (2024). Risk assessment in ACS using grace score. INDIAN JOURNAL OF APPLIED RESEARCH, 14(9), September-2024.
[0321] Elbarouni, B. et al. (2009). Validation of the Global Registry of Acute Coronary Event (GRACE) risk score for in-hospital mortality in patients with acute coronary syndrome in Canada. American Heart Journal, 158(3), 392-399.
[0322] Tran, A. V., et al. (2024). Prognostic value of in-hospital and 6-month mortality after acute coronary syndrome using GRACE, TIMI, and HEART scores. Medicina Clínica Práctica, 7(2).
[0323] Wontor R, Ołpińska B, Łoboz-Rudnicka M, et al. (2024). Addition of the Tilburg Frailty Indicator to the classic Global Registry of Acute Coronary Events risk score improves its prognostic value in elderly patients with acute coronary syndrome. Pol Arch Intern Med. 2024; 134: 16862. doi:10.20452/pamw.16862.
[0324] Abu-Assi, E., et al. (2010). Validation of the GRACE risk score for predicting death within 6 months of follow-up in a contemporary cohort of patients with acute coronary syndrome. Revista Española de Cardiología (English Edition), 63(6), 640-648.
[0325] Baig, M., et al. (2018). Evaluation of Patients at Risk of Hospital Readmission (PARR) and LACE Risk Score for New Zealand Context. In: Connecting the System to Enhance the Practitioner and Consumer Experience in Healthcare (Volume 252). IOS Press, pp. 21-26.
[0326] Rajaguru, V., et al. (2022). LACE Index to Predict the High Risk of 30-Day Readmission in Patients With Acute Myocardial Infarction at a University Affiliated Hospital. Frontiers in Cardiovascular Medicine, 9, 925965.
Dobler, C. C., et al. (2020). Ability of the LACE index to predict 30-day hospital readmissions in patients with community-acquired pneumonia. ERJ Open Research, 6(2), 00301-2019. doi:10.1183/23120541.00301-2019.
[0327] Equivalents and Scope
[0328] Unless context dictates otherwise, the descriptions and definitions of the features set out above are not limited to any particular aspect or embodiment of the invention and apply equally to all aspects and embodiments which are described. The term “and / or”, where used herein, is to be taken as specific disclosure of each of the two specified features or components with or without the other. For example, “A and / or B” is to be taken as specific disclosure of each of (i) A, (ii) B and (iii) A and B, just as if each is set out individually herein. It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Ranges may be expressed herein as from “about” one particular value, and / or to “about” another particular value. When such a range is expressed, another embodiment includes from the one particular value and / or to the other particular value. Similarly, when values are expressed as approximations, by the use of the antecedent “about” or “approximately”, it will be understood that the particular value forms another embodiment. The terms “about” or “approximately” in relation to a numerical value are optional and mean, for example, +/- 10%.
[0329] Throughout this specification, including the claims which follow, unless the context requires otherwise, the words “comprise” and “include”, and variations such as “comprises”, “comprising”, and “including” will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps. Other aspects and embodiments of the invention provide the aspects and embodiments described above with the term “comprising” replaced by the term “consisting of” or “consisting essentially of”, unless the context dictates otherwise. The features disclosed in the foregoing description, or in the following claims, or in the accompanying drawings, expressed in their specific forms or in terms of a means for performing the disclosed function, or a method or process for obtaining the disclosed results, as appropriate, may, separately, or in any combination of such features, be utilized for realizing the invention in diverse forms thereof. While the invention has been described in conjunction with the exemplary embodiments described above, many equivalent modifications and variations will be apparent to those skilled in the art when given this disclosure. Accordingly, the exemplary embodiments of the invention set forth above are considered to be illustrative and not limiting. Various changes to the described embodiments may be made without departing from the spirit and scope of the invention. For the avoidance of any doubt, any theoretical explanations provided herein are provided for the purposes of improving the understanding of a reader. The inventors do not wish to be bound by any of these theoretical explanations. Any section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described.
Claims
1. A computer-implemented method of providing a prognosis for a patient who has been treated for ST-segment-elevation myocardial infarction (STEMI), the method comprising: receiving the values of a plurality of predetermined features associated with the patient, the predetermined features comprising: one or more patient demographic features, one or more hospital admission history features, one or more clinical history features, one or more vital signs features and / or one or more laboratory test features; and predicting, using the values of said plurality of features, a prognosis for the patient, wherein said predicting comprises using one or more machine learning models to predict a risk of the patient experiencing one or more respective post-treatment complications, wherein the one or more respective post-treatment complications are selected from: readmission within a first predetermined period of time, in-hospital mortality, and mortality within a second predetermined period of time, and wherein each of said one or more machine learning models has been trained to predict the risk of one of said post-treatment complications using training data comprising, for each of a plurality of patients who have been treated for STEMI: (i) the values of a predetermined set of said plurality of features and (ii) an indication of whether the patient has suffered from the post-treatment complication; optionally wherein receiving the values of the plurality of predetermined features comprises sending a query to an Electronic Medical Records (EMR) system and receiving the values from said EMR system; and / or wherein the method comprises receiving a selection of one or more of the post-treatment complications and selecting a machine learning model from a set of machine learning models to predict each selected post-treatment complication, the set of machine learning models comprising a machine learning model trained to predict readmission within the first predetermined period of time, a machine learning model trained to predict in-hospital mortality, and a machine learning model trained to predict mortality within the second predetermined period of time.
2. The method of claim 1, wherein the plurality of predetermined features comprises a first set of features comprising: (i) a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, and a feature indicative of whether the patient has suffered cardiogenic shock, and (ii) a plurality of laboratory test features comprising: a feature indicative of the patient’s white blood cell count, and optionally a feature indicative of the patient’s serum albumin level and / or one or more demographic features comprising a feature indicative of the patient’s age; wherein predicting, using the values of said plurality of features, a prognosis for the patient comprises using a machine learning model that has been trained to predict the risk of in-hospital mortality using training data comprising for each of a plurality of patients who have been treated for STEMI: (i) the values of said first set of features and (ii) an indication of whether the patient has suffered from in-hospital mortality after treatment for STEMI.
3. The method of claim 2, wherein the first set of features further comprises: one or more clinical history features selected from: a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered atrial arrhythmia, a feature indicative of whether the patient has suffered cerebrovascular disease; one or more demographic features comprising a feature indicative of the patient’s age; one or more laboratory test features selected from: a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, a feature indicative of the patient’s platelet count, and a feature indicative of the patient’s serum albumin level; and / or one or more vital signs features selected from: a feature indicative of the patient's systolic blood pressure, and a feature indicative of the patient's oxygen saturation level.
4. The method of claim 2 or claim 3, wherein the first set of features comprises: (i) a plurality of clinical history features including a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cardiogenic shock, and at least one further feature selected from: a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered atrial arrhythmia, a feature indicative of whether the patient has suffered cerebrovascular disease; and / or (ii) a plurality of clinical history features including a feature indicative of whether the patient has suffered cardiac arrest and a feature indicative of whether the patient has suffered cardiogenic shock, and a plurality of laboratory test features including a feature indicative of the patient’s white blood cell count and a feature indicative of the patient’s platelet count, and one or more demographic features comprising a feature indicative of the patient’s age; and / or (iii) a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, and a feature indicative of whether the patient has suffered cardiogenic shock, and one or more demographic features comprising a feature indicative of the patient’s age; and / or (iv) a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, and a feature indicative of whether the patient has suffered cardiogenic shock, one or more demographic features comprising a feature indicative of the patient’s age, and one or more laboratory test features including a feature indicative of the patient’s platelet count or a feature indicative of the patient’s serum albumin level; and / or (v) a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cardiogenic shock, and a feature indicative of whether the patient has suffered congestive heart failure, one or more demographic features comprising a feature indicative of the patient’s age, and one or more laboratory test features including a feature indicative of the patient’s serum albumin level; and / or (vi) a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cardiogenic shock, and a feature indicative of whether the patient has suffered congestive heart failure, one or more demographic features comprising a feature indicative of the patient’s age, and a plurality of laboratory test features including a feature indicative of the patient’s white blood cell count and a feature indicative of the patient’s serum albumin level; and / or (vii) at least 5 of the following 7 features: a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cardiogenic shock, a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cerebrovascular disease, a plurality of laboratory test features comprising: a feature indicative of the patient's white blood cell count, a feature indicative of the patient's serum albumin level, and one or more demographic features comprising a feature indicative of the patient's age, wherein the at least 5 features include at least one of a feature indicative of the patient's white blood cell count, and a feature indicative of the patient's serum albumin level.
5. The method of claim 3 or claim 4, wherein the first set of features comprises: (i) a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, and a feature indicative of whether the patient has suffered cardiogenic shock, a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered atrial arrhythmia, a feature indicative of whether the patient has suffered cerebrovascular disease; a plurality of laboratory test features comprising: a feature indicative of the patient’s white blood cell count, and a feature indicative of the patient’s serum albumin level, a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, a feature indicative of the patient’s platelet count; a plurality of vital signs features comprising: a feature indicative of the patient's systolic blood pressure, and a feature indicative of the patient's oxygen saturation level; and one or more demographic features comprising a feature indicative of the patient’s age; or (ii) a plurality of clinical history features comprising: a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cardiogenic shock, a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cerebrovascular disease, a plurality of laboratory test features comprising: a feature indicative of the patient's white blood cell count, a feature indicative of the patient's serum albumin level, and one or more demographic features comprising a feature indicative of the patient's age.
6. The method of any preceding claim, wherein the values of patient demographic features are associated with the time of the prediction, and values of vital signs features and / or laboratory test features are associated with one or more measurements at a latest available date and / or within a predetermined period of time preceding an index date, or a summarized value derived from a plurality of said values, wherein the index date for prediction of in-hospital mortality, and mortality within a second predetermined period of time is the latest date at which the patient has been diagnosed with STEMI prior to receiving treatment for STEMI, or the index date for prediction of readmission within a first predetermined period of time is the date at which the predicting is performed or the date of discharge of the patient.
7. The method of any preceding claim, wherein the plurality of predetermined features comprises a second set of features comprising: a plurality of clinical history features comprising a feature indicative of whether the patient has suffered cardiac arrest and one or both of a feature indicative of whether the patient has suffered congestive heart failure and a feature indicative of whether the patient has suffered a cardiogenic shock, and optionally one or more demographic features comprising a feature indicative of the patient’s age, and / or one or more laboratory test features comprising a feature indicative of the patient’s serum albumin level; and wherein predicting, using the values of said plurality of features, a prognosis for the patient comprises using a machine learning model that has been trained to predict the risk of mortality within the second predetermined period after treatment of STEMI using training data comprising for each of a plurality of patients who have been treated for STEMI: (i) the values of said second plurality of features and (ii) an indication of whether the patient has suffered from mortality within the predetermined period of time after treatment of STEMI.
8. The method according to claim 7, wherein the second plurality of features further comprises: one or more clinical history features selected from: a feature indicative of whether the patient has suffered cerebrovascular disease, a feature indicative of whether the patient has suffered rales and / or jugular venous distension (JVD), and a feature indicative of whether the patient has suffered atrial arrhythmia; one or more laboratory test features selected from: a feature indicative of the patient’s blood urea nitrogen (BUN) level, a feature indicative of the patient’s white blood cell count, a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s platelet count, a feature indicative of the patient’s hemoglobin level; and / or one or more vital signs features comprising a feature indicative of the patient's systolic blood pressure.
9. The method according to claim 7 or claim 8, wherein the second set of features comprises: (i) a plurality of clinical history features including a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cardiac arrest, and one or more further clinical history features selected from: a feature indicative of whether the patient has suffered cerebrovascular disease, a feature indicative of whether the patient has suffered rales and / or jugular venous distension (JVD), a feature indicative of whether the patient has suffered atrial arrhythmia, and a feature indicative of whether the patient has suffered cardiogenic shock; and / or (ii) a plurality of clinical history features including a feature indicative of whether the patient has suffered congestive heart failure and a feature indicative of whether the patient has suffered cardiac arrest, and one or more laboratory test features selected from: a feature indicative of the patient’s white blood cell count, a feature indicative of the patient’s platelet count, and a feature indicative of the patient’s hemoglobin level; and / or (iii) a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cerebrovascular disease, a feature indicative of whether the patient has suffered rales and / or jugular venous distension (JVD), a feature indicative of whether the patient has suffered atrial arrhythmia, and a feature indicative of whether the patient has suffered cardiogenic shock; one or more demographic features comprising a feature indicative of the patient’s age; a plurality of laboratory test features comprising a feature indicative of the patient’s serum albumin level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, a feature indicative of the patient’s white blood cell count, a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s platelet count, and a feature indicative of the patient’s hemoglobin level; and one or more vital signs features comprising a feature indicative of the patient's systolic blood pressure; and / or (iv) at least 6 of the following 8 features: a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cerebrovascular disease, and a feature indicative of whether the patient has suffered cardiogenic shock; one or more demographic features comprising a feature indicative of the patient’s age; and a plurality of laboratory test features comprising a feature indicative of the patient’s serum albumin level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, and a feature indicative of the patient’s white blood cell count, wherein the at least 6 features include at least the clinical history features; and / or (v) a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered cardiac arrest, a feature indicative of whether the patient has suffered cerebrovascular disease, and a feature indicative of whether the patient has suffered cardiogenic shock; one or more demographic features comprising a feature indicative of the patient’s age; and a plurality of laboratory test features comprising a feature indicative of the patient’s serum albumin level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, and a feature indicative of the patient’s white blood cell count.
10. The method of any preceding claim, wherein the second predetermined period of time is between 3 months and 12 months, between 3 months and 9 months, between 3 months and 6 months, or wherein the second predetermined period of time is 6 months.
11. The method of any preceding claim, wherein the plurality of predetermined features comprises a third set of features comprising: one or more hospital admission history features comprising a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting, and one or more laboratory test features comprising a feature indicative of the patient’s hemoglobin level and / or a feature indicative of the patient’s blood urea nitrogen (BUN) level; and wherein predicting, using the values of said plurality of features, a prognosis for the patient comprises using a machine learning model that has been trained to predict the risk of readmission at hospital within the first predetermined period of time after discharge of the STEMI patient using training data that comprises for each of a plurality of patients who have been treated for STEMI: (i) the values of said third plurality of features and (ii) an indication of whether the STEMI patient was readmitted at hospital within the predetermined period of time after discharge.
12. The method according to claim 11, wherein the third set of features further comprises: one or more of a plurality of laboratory test features selected from: a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s blood high density lipoprotein (HDL) level, a feature indicative of the patient’s blood sodium level, a feature indicative of the patient’s serum albumin level; and / or one or more hospital admission history features comprising a feature indicative of the patient’s admission duration; and / or a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered renal disease, a feature indicative of whether the patient has suffered chronic pulmonary disease.
13. The method according to claim 11 or claim 12, wherein the third set of features comprises: (i) a hospital admission history feature comprising a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting, and a plurality of laboratory test features including a feature indicative of the patient’s serum creatinine level, and a feature indicative of the patient’s blood urea nitrogen (BUN) level; optionally wherein the plurality of laboratory test features further include a feature indicative of the patient’s blood high density lipoprotein (HDL) level, a feature indicative of the patient’s blood sodium level, and a feature indicative of the patient’s serum albumin level; and / or (ii) one or more hospital admission history features comprising a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting; and a plurality of laboratory test features comprising a feature indicative of the patient’s hemoglobin level, a feature indicative of the patient’s serum creatinine level, a feature indicative of the patient’s blood urea nitrogen (BUN) level, a feature indicative of the patient’s blood high density lipoprotein (HDL) level, a feature indicative of the patient’s blood sodium level, and a feature indicative of the patient’s serum albumin level; and / or (iii) a plurality of hospital admission history features comprising: a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting, and a feature indicative of the patient’s admission duration; and a plurality of laboratory test features comprising a feature indicative of the patient’s hemoglobin level and a feature indicative of the patient’s blood urea nitrogen (BUN) level; and a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered renal disease, and a feature indicative of whether the patient has suffered chronic pulmonary disease; and / or (iv) at least 5 of the following 7 features: a plurality of hospital admission history features comprising: a feature indicative of a number of emergency department visits made by the patient in a predetermined period of time prior to the predicting, and a feature indicative of the patient’s admission duration; and a plurality of laboratory test features comprising a feature indicative of the patient’s hemoglobin level and a feature indicative of the patient’s blood urea nitrogen (BUN) level; and a plurality of clinical history features comprising a feature indicative of whether the patient has suffered congestive heart failure, a feature indicative of whether the patient has suffered renal disease, and a feature indicative of whether the patient has suffered chronic pulmonary disease; wherein the at least 5 features include at least the plurality of clinical history features.
14. The method according to any preceding claim, wherein the first predetermined period of time is between 10 and 60 days, between 20 and 40 days, between 20 and 30 days, between 2 and 6 weeks, between 2 weeks and 2 months, or 30 days.
15. The method according to any preceding claim, wherein each of the one or more machine learning models is individually selected from: decision trees, regularized and / or gradient boosted decision trees, random forest models, logistic regression models, and / or wherein each of the one or more machine learning models is a non-linear model, and / or wherein each of the one or more machine learning models is an ensemble model, and / or wherein each of the one or more machine learning models is a tree-based model.
16. The method according to any preceding claim, wherein each of the one or more machine learning models has been trained to classify patients between a plurality of categories comprising a first category associated with a high risk of a post-treatment complication that the machine learning model has been trained to predict, and one or more further categories associated with lower risks of the post-treatment complication that the machine learning model has been trained to predict, optionally wherein the one or more further categories comprise a second category associated with a low risk of the post-treatment complication and a third category associated with a medium risk of the post-treatment complication.
17. The method of any preceding claim, wherein each of the one or more machine learning models has been trained to provide as output a probability of the patient experiencing a post-treatment complication, or wherein each of the one or more machine learning models has been trained to provide as output a probabilistic score indicative of the risk of the patient experiencing a post-treatment complication.
18. The method of claim 17, wherein each of the one or more machine learning models has been trained to classify patients between a plurality of categories associated with different risks of experiencing a post-treatment complication, wherein the predicting comprises classifying the patient between a plurality of categories by comparing the output of the machine learning model to a first predetermined threshold, wherein the patient is classified in a first, high risk category of experiencing a post-treatment complication when the output of the machine learning model is above the first threshold, and / or by comparing the output of the machine learning model to a second predetermined threshold, wherein the patient is classified in a second, low risk category of experiencing a post-treatment complication when the output of the machine learning model is below the second threshold.
19. The method of claim 18, wherein the first and second predetermined thresholds are thresholds that have been identified such that patients in a reference cohort, optionally the training patients, classified in the first and second categories represent predetermined proportions of the reference cohort, optionally wherein the predetermined proportions may be proportions that are classified as low and high risk, respectively, using a known risk classification score for prediction of the post-treatment complication.
20. The method of any preceding claim, further comprising outputting, to a user interface or computing device, a result of said predicting, optionally wherein the result includes a score output by the one or more machine learning models, or a classification between a plurality of categories associated with respective risks of experiencing the one or more post-treatment complications.
21. The method according to any preceding claim, further comprising selecting a treatment or monitoring plan in accordance with the results of said predicting and / or wherein the predicting comprises classifying the patient between a plurality of categories associated with respective risks of the patient experiencing a post-treatment complication, and wherein the method comprises selecting a first treatment and / or monitoring plan when the patient is classified in a first, high risk category, and selecting a second treatment and / or monitoring plan when the patient is classified in a second, low risk category.
22. A method of providing a tool for predicting the risk of a post-treatment complication for a STEMI patient, the method comprising: obtaining training data comprising, for each of a plurality of patients who have been treated for STEMI: (i) values of a plurality of predetermined features associated with the patient and (ii) a corresponding indication of whether the patient has suffered from the post-treatment complication; and training a machine learning model to predict the risk of a STEMI patient suffering from the post-treatment complication using said training data, wherein the machine learning model is trained to take as input the values of said plurality of features and to produce as output an indication of risk of a STEMI patient suffering from the post-treatment complication; wherein the predetermined features comprise: one or more patient demographic features, one or more hospital admission history features, one or more clinical history features, one or more vital signs features and / or one or more laboratory tests, and wherein the one or more post-treatment complications are selected from: readmission within a first predetermined period of time, in-hospital mortality, and mortality within a second predetermined period of time.
23. The method of claim 22, further comprising selecting, from a plurality of candidate features, the predetermined features for one or more of the machine learning models, said selecting comprising: determining the values of one or more feature importance metrics, optionally a SHAP value, for each of said candidate features, ranking the set of candidate features according to the feature importance values; and identifying a subset of said candidate features using one or more model performance metrics including the Area Under the Receiver Operating Characteristic Curve (AUC) by iteratively including additional features of lower rank and evaluating said one or more model performance metrics, wherein the predetermined set of features are selected as features whose inclusion improves the AUC of the model.
24. The method of claim 22 or 23, wherein the method has the features of any of claims 2 to 21.
25. The method of any of claims 1 to 21, further comprising training the one or more models using the method of any of claims 22 to 24.
26. A system comprising: at least one processor; and at least one non-transitory computer readable medium containing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any of claims 1 to 25.
27. A non-transitory computer readable medium or media storing instructions that, when executed by a processor, cause the processor to perform the method of any of claims 1 to 25.
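Illustrative sketch (not part of the claims): the threshold-based classification recited in claims 18 and 19 could be realized along the following lines. This is a minimal sketch under stated assumptions, not the applicant's implementation. The function names thresholds_from_cohort and classify, the use of simple order statistics, and the treatment of scores between the two thresholds as a "medium" category are all assumptions made for illustration.

```python
from typing import List, Tuple


def thresholds_from_cohort(scores: List[float],
                           high_fraction: float,
                           low_fraction: float) -> Tuple[float, float]:
    """Derive (first, second) thresholds from reference-cohort model outputs.

    Roughly `high_fraction` of the cohort ends up strictly above the first
    threshold and roughly `low_fraction` strictly below the second one.
    Tied scores make the proportions approximate. (Hypothetical helper.)
    """
    if not scores:
        raise ValueError("reference cohort is empty")
    if high_fraction < 0 or low_fraction < 0 or high_fraction + low_fraction > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    ordered = sorted(scores)
    n = len(ordered)
    n_high = int(high_fraction * n)  # cohort members intended as high risk
    n_low = int(low_fraction * n)    # cohort members intended as low risk
    # First threshold: largest score that is NOT in the high-risk group.
    first = ordered[n - n_high - 1] if n_high < n else float("-inf")
    # Second threshold: smallest score that is NOT in the low-risk group.
    second = ordered[n_low] if n_low < n else float("inf")
    return first, second


def classify(score: float, first: float, second: float) -> str:
    """Compare a model output to the two thresholds (claim 18 style)."""
    if score > first:
        return "high"
    if score < second:
        return "low"
    return "medium"  # assumption: everything in between is medium risk
```

In this sketch the high-risk comparison is checked first, so a score can never be assigned to both categories even if the two proportions sum to the whole cohort.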
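Illustrative sketch (not part of the claims): the feature selection recited in claim 23 ranks candidate features by an importance metric, then adds them one at a time in rank order and keeps those whose inclusion improves the AUC. The sketch below shows that loop under stated assumptions. The names roc_auc and select_features are hypothetical, the importance values are supplied by the caller (for example mean absolute SHAP values computed elsewhere), evaluate_auc is a caller-supplied function that would train and validate a model on the given feature subset, and the chance-level starting baseline of 0.5 is an assumption.

```python
from typing import Callable, Dict, List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case, counting ties as one half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def select_features(importance: Dict[str, float],
                    evaluate_auc: Callable[[List[str]], float]) -> List[str]:
    """Rank candidates by importance, then keep each feature only if
    adding it improves the AUC reported by `evaluate_auc`."""
    ranked = sorted(importance, key=lambda name: importance[name], reverse=True)
    selected: List[str] = []
    best_auc = 0.5  # assumption: start from a chance-level baseline
    for name in ranked:
        candidate = selected + [name]
        auc = evaluate_auc(candidate)  # e.g. validation AUC of a refit model
        if auc > best_auc:
            selected, best_auc = candidate, auc
    return selected
```

In practice evaluate_auc would refit the model on each candidate subset and score it on held-out patients, for instance by passing the held-out labels and predicted risks to roc_auc.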