Machine learning model for predicting a prognostic outcome of hypertrophic cardiomyopathy (HCM)
By processing multidimensional medical data through machine learning models, the problem of predicting disease progression and cardiovascular events in early-stage HCM patients has been solved, enabling accurate prognosis and personalized treatment guidance for HCM patients.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BRISTOL MYERS SQUIBB CO
- Filing Date
- 2024-11-22
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies are insufficient to effectively predict disease progression and cardiovascular events in early-stage hypertrophic cardiomyopathy (HCM) patients, especially due to a lack of understanding of the prognostic characteristics of asymptomatic or mildly symptomatic patients, making it difficult to assess the therapeutic effects of cardiac myosin inhibitors within an appropriate timeframe.
The machine learning (ML) medical prognostic model is trained by processing multidimensional medical data, including medical imaging data, cardiac measurement data, electrocardiogram data, laboratory test results, and genomic data, to predict the probability of disease progression and cardiovascular event risk in patients. Survival models, neural networks, and other technologies are used for prediction.
It enables accurate prediction of early disease progression and cardiovascular events in HCM patients, helps identify high-risk patients and guides personalized treatment decisions, and improves the ability to manage the course of disease in asymptomatic or mildly symptomatic patients.
Figure CN122249866A_ABST
Abstract
Description
Technical Field
[0001] This disclosure relates to machine learning (ML) models for predicting prognostic outcomes of hypertrophic cardiomyopathy (HCM).
Background Technology
[0002] Hypertrophic cardiomyopathy (HCM) is a chronic heart disease that causes thickening and stiffening of the myocardium. HCM is one of the most common inherited heart diseases, with an estimated prevalence of 1 in 200 to 1 in 500 people. It is characterized by left ventricular hypertrophy (LVH) in the absence of another cardiac, systemic, or metabolic precipitant, such as hypertension. In hereditary HCM, the disease is inherited in an autosomal dominant pattern and is caused by mutations in sarcomere protein genes. In idiopathic HCM, the disease may not have a known genetic cause. Generally, HCM leads to structural changes in the heart, thereby affecting its function.
[0003] Cardiac myosin inhibitors have recently emerged as a new treatment option for HCM patients and are considered to have the potential to improve the disease. Trials investigating the efficacy of cardiac myosin inhibitors have primarily focused on symptomatic New York Heart Association (NYHA) class II and III HCM patients. Because cardiac myosin inhibitor treatment is disease-specific, it is expected to prevent asymptomatic patients from progressing to symptomatic disease.
Summary of the Invention
[0004] One aspect of this disclosure provides a computer-implemented method that, when executed on data processing hardware, causes the data processing hardware to perform operations. The operations include: receiving multidimensional medical data of a patient from one or more sources; extracting features from the multidimensional medical data; and processing the extracted features using one or more machine learning (ML) medical prognostic models to predict the probability of progression of hypertrophic cardiomyopathy (HCM) in the patient before a threshold date. The one or more ML medical prognostic models are trained using a training process that includes: obtaining corresponding baseline training data for each of a plurality of patients, the baseline training data including baseline multidimensional medical data spanning multiple different modalities; and obtaining corresponding follow-up training data for each of the plurality of patients, the follow-up training data including follow-up multidimensional medical data that spans multiple different modalities and was collected after the baseline training data, and corresponding real clinical outcomes. The training process also includes training the one or more ML medical prognostic models based on the baseline training data and the follow-up training data to teach the one or more ML medical prognostic models how to predict the corresponding real clinical outcome.
[0005] Implementations of this disclosure may include one or more of the following optional features. In some examples, the one or more ML medical prognostic models include one or more models selected from the group consisting of: survival models, neural networks, convolutional neural networks (CNNs), attention-based neural networks, generative neural networks, autoencoders, variational autoencoders (VAEs), regression models, linear models, nonlinear models, support vector machines, decision tree models, random forest models, ensemble models, Bayesian models, naive Bayesian models, k-means models, k-nearest neighbor models, principal component analysis, Markov models, and any combination thereof. In some implementations, the regression model includes a multivariate survival model or a multivariate time response function (mTRF) model, wherein the mTRF model includes independent variables corresponding to the multidimensional medical data. In some implementations, the generative neural network includes a conditional generative adversarial network (cGAN), which includes one or more conditions corresponding to the multidimensional medical data.
[0006] In some implementations, the baseline multidimensional medical data is collected for the patient over a threshold time period starting from an index date corresponding to the time the patient was diagnosed with HCM. In some examples, the multidimensional medical data includes at least one of medical imaging data, cardiac measurement data, clinical data, electrocardiogram data, laboratory test results, genomic data, or functional test results. In some implementations, the predicted probability includes the predicted probability of the patient experiencing a cardiovascular event, which includes at least one of the following: cardiovascular-related hospitalization, new diagnosis of atrial fibrillation, heart failure requiring treatment, fatal ventricular arrhythmia leading to cardiac arrest, appropriate implantable cardioverter defibrillator shock, transient ischemic attack, stroke, death, acute myocardial infarction, worsening pVO2, or worsening LVOT pressure gradient.
[0007] In some cases, the patient has a New York Heart Association (NYHA) class I assessment, and the predicted probability includes the probability that the patient's HCM will progress to NYHA class II or higher before the threshold date. In other cases, the patient has an early NYHA class II assessment, and the predicted probability includes the probability that the patient's HCM will progress to NYHA class III or higher before the threshold date. In still other cases, the predicted probability includes the probability that the patient's HCM will change from non-obstructive HCM to obstructive HCM before the threshold date, or that HCM treatment will be initiated before the threshold date.
[0008] In some implementations, the operation further includes: receiving follow-up multidimensional medical data of the patient, which was collected for the patient during follow-up visits; extracting follow-up features from the follow-up multidimensional medical data; and further processing the extracted features using one or more trained ML medical prognostic models to predict the probability of HCM progression before a threshold date. In other examples, the operation further includes determining that the predicted probability meets a threshold probability value, and selecting the patient for inclusion in a clinical trial based on determining that the predicted probability meets the threshold probability value. In still other examples, the operation further includes determining that the predicted probability meets the threshold probability value, and treating the patient with a cardiac myosin inhibitor based on determining that the predicted probability meets the threshold probability value.
[0009] Another aspect of this disclosure provides a computer-implemented method that, when executed on data processing hardware, causes the data processing hardware to perform operations. These operations include, for each specific patient among a plurality of patients diagnosed with hypertrophic cardiomyopathy (HCM) and meeting inclusion criteria: obtaining corresponding baseline training data, which includes baseline multidimensional medical data spanning multiple different modalities and collected for that specific patient within a threshold time period starting from the corresponding index date assigned to that specific patient; and obtaining corresponding follow-up training data, which includes follow-up multidimensional medical data spanning multiple different modalities and collected for that specific patient after the corresponding index date, and corresponding real clinical outcomes. The operations also include training one or more machine learning (ML) medical prognostic models based on the corresponding baseline training data and the corresponding follow-up training data to teach the one or more ML medical prognostic models to predict the corresponding real clinical outcome.
[0010] In some examples, the multidimensional medical data includes at least one of medical imaging data, cardiac measurement data, clinical data, electrocardiogram data, laboratory test results, genomic data, or functional test results. In some implementations, the predicted probability includes the predicted probability of the patient experiencing a cardiovascular event, which includes at least one of the following: cardiovascular-related hospitalization, new diagnosis of atrial fibrillation, heart failure requiring treatment, fatal ventricular arrhythmia leading to cardiac arrest, appropriate implantable cardioverter defibrillator shock, transient ischemic attack, stroke, death, acute myocardial infarction, worsening pVO2, or worsening LVOT pressure gradient.
[0011] In some cases, the patient has a New York Heart Association (NYHA) class I assessment, and the predicted probability includes the probability that the patient's HCM will progress to NYHA class II or higher before the threshold date. In other cases, the patient has an early NYHA class II assessment, and the predicted probability includes the probability that the patient's HCM will progress to NYHA class III or higher before the threshold date. In still other cases, the predicted probability includes the probability that the patient's HCM will change from non-obstructive HCM to obstructive HCM before the threshold date, or that HCM treatment will be initiated before the threshold date.
[0012] In some implementations, training one or more ML medical prognostic models based on corresponding baseline training data and corresponding follow-up training data obtained for a specific patient includes, for each of the multiple different modalities: extracting baseline features associated with the corresponding modality from the corresponding baseline training data; extracting follow-up features associated with the corresponding modality from the corresponding follow-up training data; and using the baseline features and follow-up features associated with the corresponding modality to train the corresponding modality-specific ML medical prognostic model.
[0013] In some examples, for at least one specific patient among the plurality of patients: obtaining the corresponding baseline training data and the corresponding follow-up training data includes accessing a local storage device storing the corresponding baseline training data and the corresponding follow-up training data for the specific patient via federated data access technology, the local storage device being controlled by the owner of the corresponding baseline training data and the corresponding follow-up training data; and training the one or more ML medical prognostic models based on the corresponding baseline training data and the corresponding follow-up training data includes training the one or more prognostic models by locally processing the corresponding baseline training data and the corresponding follow-up training data accessed from the local storage device on a corresponding working node controlled by the owner of the corresponding baseline training data and the corresponding follow-up training data.
Attached Figure Description
[0014] Figure 1 is a schematic diagram of an example prognostic system used to predict the progression of hypertrophic cardiomyopathy (HCM) in patients.
[0015] Figures 2A and 2B are schematic diagrams of an example artificial neural network (ANN) used to predict the progression of HCM in patients.
[0016] Figure 3 is a schematic diagram of an example convolutional neural network (CNN) used to predict the progression of HCM in patients.
[0017] Figure 4 is a schematic diagram of an example generative adversarial network (GAN) used to predict the progression of HCM in patients.
[0018] Figure 5 is a schematic diagram of an example variational autoencoder (VAE) used to predict the progression of HCM in patients.
[0019] Figure 6 is a schematic diagram of an example converged architecture.
[0020] Figure 7 is a schematic diagram of another example of a converged architecture.
[0021] Figure 8 is a schematic diagram of an example training process for training an ML medical prognostic model used to predict the progression of HCM in patients.
[0022] Figure 9 is a schematic diagram of another example training process for training an ML medical prognostic model used to predict the progression of HCM in patients.
[0023] Figure 10 is a schematic diagram of an example training process for training multiple ML medical prognostic models used to predict the progression of HCM in patients.
[0024] Figure 11 is a schematic diagram of another example training process for training multiple ML medical prognostic models used to predict the progression of HCM in patients.
[0025] Figure 12 is a schematic diagram of an example data preprocessing process used to generate a training dataset for an ML medical prognostic model that predicts the progression of HCM in patients.
[0026] Figure 13 is a schematic diagram of an example process for generating a medical prognostic model for predicting the progression of HCM in patients.
[0027] Figure 14 is a flowchart of an example arrangement of operations for a computer-implemented method of predicting the progression of hypertrophic cardiomyopathy (HCM) in patients.
[0028] Figure 15 is a flowchart of an example arrangement of operations for a computer-implemented method of training an ML medical prognostic model for predicting the progression of HCM in patients.
[0029] Figures 16A to 16C are charts showing examples of clinical outcomes extracted for a patient.
[0030] Figure 17 is a schematic diagram of an example computing device that can be used to implement the systems and methods described herein.
Detailed Implementation
[0031] Patients with HCM often remain asymptomatic or have mild symptoms. The prevalence of symptomatic HCM in the United States is estimated to be less than 1 in 3,000 adults. Patients are usually diagnosed during the symptomatic stage after presenting with symptoms such as shortness of breath, angina, fatigue, exercise intolerance, or presyncope and / or syncope during exercise. Detection of HCM in asymptomatic patients may occur during family screenings relying on genetic testing or through abnormal electrocardiograms obtained for other reasons, such as life insurance physicals or non-cardiac surgery or procedures. Diagnosis is typically confirmed by echocardiography and / or cardiac magnetic resonance imaging.
[0032] The New York Heart Association (NYHA) classification is commonly used in routine clinical practice and trials to grade symptoms of exercise-induced heart failure (HF). The NYHA classification categorizes patients into one of four classes based on their limitations during physical activity. It ranges from Class I (corresponding to asymptomatic patients during daily physical activity) to Class IV (corresponding to severe deficits where patients experience symptoms even at rest).
[0033] Cardiac myosin inhibitors have recently emerged as a new treatment option for HCM patients and are considered to have the potential to improve the disease. Trials on the efficacy of cardiac myosin inhibitors have primarily focused on patients with symptomatic NYHA class II and III HCM. Because cardiac myosin inhibitor treatment is disease-specific, it is expected to prevent asymptomatic patients from progressing to symptomatic disease. However, testing the efficacy of cardiac myosin inhibitors in patients at the early stages of HCM (i.e., those classified as asymptomatic NYHA class I, or those exhibiting only mild symptoms in NYHA class II) remains challenging, as disease progression can be slow, making it impractical to measure treatment effectiveness based on progression to higher NYHA classes within an appropriate timeframe. Furthermore, HCM is a heterogeneous disease with poorly understood progression patterns, the natural history of early-stage HCM is not well documented or understood, and few prognostic medical features have been identified for HCM, especially in its early stages. Therefore, prognostic methods are needed to predict the progression of HCM in patients, that is, methods for determining the clinical outcomes or course of HCM in patients. As disclosed herein, predicting clinical outcomes or disease course may include medically predicting the probability of a specific progression of HCM occurring before a specific threshold date (e.g., within the next 3 years), for example, disease progression from a first NYHA level assessment to a second NYHA level assessment higher than the first, or the occurrence of a cardiovascular event due to HCM. In some instances, predictive outcomes for multiple patients are used to determine inclusion and / or exclusion criteria for selecting patient populations for clinical studies.
[0034] The implementations described herein involve training machine learning (ML) medical prognostic models to teach them to predict symptom progression and cardiovascular events in patients diagnosed with HCM. The implementations also involve using such ML medical prognostic models to process patient medical data to infer or identify biomarkers for disease progression from asymptomatic or mildly symptomatic stages to more severe stages, or to predict the likelihood of cardiovascular events. Notably, the medical data used to train the ML medical prognostic models or predict HCM progression may be unstructured data (e.g., raw images or raw measurement data). Advantageously, the ML medical prognostic models disclosed herein are interpretable, making them, for example, useful for generating medical findings that enable action by healthcare professionals, and / or for deriving clinically relevant criteria for guiding care of patients diagnosed with HCM and / or for selecting patients for clinical trials.
[0035] Example medical data obtained for patients may include, but is not limited to: clinical factors, medical imaging data (such as echocardiogram images), cardiac measurements (e.g., end-diastolic left atrial area, end-diastolic left ventricular area, atrioventricular coupling, end-diastolic left atrioventricular coupling index), cardiovascular magnetic resonance (CMR) images, electrocardiogram (ECG) measurements, laboratory test results, genomics and / or functional tests. The results output from these ML medical prognostic models can be used to identify patients at risk of disease progression, along with insights into the size of the relevant risk population, to assess the feasibility of trials for patients already identified as at risk. Specifically, ML medical prognostic models can be trained and used during inference to identify patients with HCM in an early stage, i.e., those classified as asymptomatic NYHA class I and those exhibiting only mild symptoms in NYHA class II, who are at risk of rapidly progressing to higher NYHA classes and therefore may benefit from treatment with cardiac myosin inhibitors. In other words, ML medical prognostic models can be trained to perform prognostic enrichment by identifying subgroups of HCM patients classified as asymptomatic NYHA class I or NYHA class II with only mild symptoms, who are likely to progress to more severe stages. Prognostic enrichment refers to selecting patients who are more likely to experience NYHA class progression or any other disease-related outcomes of interest.
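As a minimal sketch of prognostic enrichment, the following Python function selects the patients whose predicted probability of progression meets a threshold. The function name `enrich_cohort`, the patient identifiers, the probabilities, and the use of a greater-than-or-equal comparison for "meets" are illustrative assumptions and are not taken from this disclosure.

```python
def enrich_cohort(predicted: dict[str, float], threshold: float) -> list[str]:
    """Return IDs of patients whose predicted progression probability meets the threshold.

    `predicted` maps a hypothetical patient ID to a model-predicted probability.
    Treating "meets" as >= is an assumption made for this sketch.
    """
    return sorted(pid for pid, prob in predicted.items() if prob >= threshold)


# Hypothetical predicted probabilities for three patients.
cohort = enrich_cohort({"p1": 0.2, "p2": 0.7, "p3": 0.5}, threshold=0.5)
print(cohort)
```

The size of the returned list relative to the screened population gives a rough indication of the size of the at-risk population referred to above.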
[0036] Examples of predicted cardiovascular events include, but are not limited to: cardiovascular hospitalization, new diagnosis of atrial fibrillation (AF) (whether hospitalized or not), heart failure episode (whether hospitalization is required or not), transition from non-obstructive HCM (nHCM) to obstructive HCM (oHCM), initiation and / or escalation of HCM treatment (e.g., beta-blockers, verapamil, diltiazem and / or disopyramide), interventional procedures (device implantation, myectomy, alcohol septal ablation, heart transplantation), fatal ventricular arrhythmia (VT / VF) episode leading to cardiac arrest (post-resuscitation), hospitalization and / or appropriate implantable cardioverter defibrillator (ICD) shock, transient ischemic attack (TIA) and stroke, death (all cardiovascular causes, including sudden cardiac death), acute myocardial infarction (MI), deterioration of exercise tolerance (e.g., pVO2), and deterioration of LVOT pressure gradient.
[0037] Figure 1 is a schematic diagram of an example of a system 100 used to medically predict the progression of hypertrophic cardiomyopathy (HCM) in patient 10 using multidimensional medical data 110, 110a-n (also referred to herein as multimodal medical data) collected for patient 10 and spanning multiple different modalities (i.e., different test or measurement data). The predicted progression may include one or more of the predicted clinical outcomes or course of HCM in patient 10 prior to a threshold date. For example, a clinical outcome or course may be the progression of HCM from a first NYHA level assessment to a second NYHA level assessment that is higher than the first NYHA level assessment (e.g., from NYHA I to NYHA II, or from NYHA II to NYHA III), the transition from non-obstructive HCM (nHCM) to obstructive HCM (oHCM), or the occurrence of a cardiovascular event due to HCM prior to a specific threshold date. For example, cardiovascular events may include: cardiovascular-related hospitalization, a new diagnosis of atrial fibrillation (AF), heart failure (HF) requiring treatment, a fatal ventricular arrhythmia leading to cardiac arrest, an appropriate implantable cardioverter defibrillator (ICD) shock, transient ischemic attack (TIA), stroke, death, acute myocardial infarction (MI), or worsening of peak oxygen consumption (pVO2) or left ventricular outflow tract (LVOT) pressure gradient. Multidimensional medical data 110 used for the patient 10 may include, but are not limited to: clinical factors, medical imaging data (such as echocardiogram images), cardiovascular magnetic resonance (CMR) images, cardiac measurements (e.g., end-diastolic left atrial area, end-diastolic left ventricular area, atrioventricular coupling, end-diastolic left atrioventricular coupling index), electrocardiogram (ECG) measurements, laboratory test results, genomics and / or functional tests.
[0038] System 100 includes a computing device 20 (e.g., a computer, laptop computer, tablet computer, smartphone, local server, remote server, or server of a distributed system running in a cloud computing environment) configured to process multidimensional medical data 110 to medically predict the progression of HCM in patient 10. Here, multidimensional medical data 110 may be collected for patient 10 during a threshold time period following an index date corresponding to the time when patient 10 was diagnosed with HCM. In some examples, system 100 also receives follow-up multidimensional medical data for patient 10, which is collected for patient 10 during follow-up visits, and extracts baseline features from multidimensional medical data 110 and follow-up features from the follow-up multidimensional medical data. In some examples, if the interval between the date of the most recent measurement or examination performed and the patient's index date is less than a threshold time, the performed measurement or examination may be used as baseline medical data. Otherwise, the measurement or examination may be considered missing. For example, the threshold time might be 3 months. Here, processing multidimensional medical data 110 to medically predict the progression of HCM in patient 10 includes using one or more trained ML medical prognostic models 200 to process the baseline and follow-up features. In some implementations, patient 10 is selected for inclusion in a clinical trial, or treatment with a cardiac myosin inhibitor is initiated, when the predicted probability is determined to meet a threshold probability value.
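The baseline rule described above can be sketched in Python as follows. This is only an illustration: the function name `select_baseline`, the approximation of 3 months as 90 days, and the use of an absolute interval (so that examinations shortly before or after the index date both qualify) are assumptions not stated in this disclosure.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: the example threshold of 3 months is approximated as 90 days.
BASELINE_WINDOW = timedelta(days=90)


def select_baseline(index_date: date, exam_dates: list[date],
                    window: timedelta = BASELINE_WINDOW) -> Optional[date]:
    """Return the most recent exam date if it lies within `window` of the index date.

    Returns None when there is no exam or the most recent exam falls outside
    the window, in which case the measurement is treated as missing.
    """
    if not exam_dates:
        return None
    latest = max(exam_dates)
    # Assumption: the interval is measured in either direction from the index date.
    if abs(latest - index_date) < window:
        return latest
    return None


# Hypothetical dates: an echocardiogram 47 days before the index date qualifies.
baseline = select_baseline(date(2024, 1, 1), [date(2023, 6, 1), date(2023, 11, 15)])
```

A measurement that returns `None` here would need to be handled downstream, for example by imputation or by a model that tolerates missing inputs; the disclosure does not specify which.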
[0039] The computing device 20 includes data processing hardware 22 and memory hardware 24 communicating with the data processing hardware 22. The memory hardware 24 stores machine- or computer-readable instructions that, when executed by the data processing hardware 22, cause the data processing hardware 22 to perform the operations disclosed herein to predict HCM progression in patient 10 or to train an ML medical prognostic model 200 for predicting HCM progression in patient 10. The computing device 20 can acquire multidimensional medical data 110 from any one or more data storage devices (not shown for clarity) via any one or more of any number and / or types of public and / or private communication networks 30. Alternatively, the computing device 20 can communicate directly with the data storage devices via any one or more of any number and / or types of wired or wireless digital communication interfaces (e.g., Bluetooth®, USB, etc.).
[0040] To medically predict the progression of HCM in patient 10, system 100 executes one or more trained ML medical prognostic models 200, 200a-n (also referred to herein as one or more models 200). The one or more models 200 can be trained using ML techniques. In some implementations, the one or more models 200 include one or more of the following: survival models, artificial neural networks (ANNs) (see Figures 2A and 2B), convolutional neural networks (CNNs) (see Figure 3), attention-based neural networks, generative neural networks such as generative adversarial networks (GANs) (including one or more conditions corresponding to modalities of the multidimensional medical data) (see Figure 4), autoencoders, and variational autoencoders (VAEs) (see Figure 5). The one or more models 200 may also include, or be composed of, regression models (e.g., multivariate survival models or multivariate time response function (mTRF) models, which include independent variables corresponding to modalities of the multidimensional medical data), linear models, nonlinear models, support vector machines, decision tree models, random forest models, ensemble models, Bayesian models, naive Bayes models, k-means models, k-nearest neighbor models, principal component analysis, Markov models, or any combination thereof. The one or more models 200 may be or include software and / or machine- or computer-readable instructions stored on memory hardware (e.g., memory hardware 24) that, when executed by a processing unit (e.g., data processing hardware 22), cause the computing device 20 to predict the progression of HCM in patient 10.
[0041] In some implementations, the one or more models 200 are trained using a training process 800 (e.g., see Figures 8 to 11). The training process 800 includes obtaining, for each of multiple patients, corresponding baseline training data, which includes baseline multidimensional medical data 110 spanning multiple modalities, as well as corresponding follow-up training data, which includes follow-up multidimensional medical data 110 spanning multiple modalities and collected after the baseline training data, and corresponding real clinical outcomes. The training process 800 uses the baseline training data and follow-up training data to train the one or more models 200 to teach them how to predict the corresponding real clinical outcomes.
[0042] Figures 2A and 2B are schematic diagrams of an example artificial neural network (ANN) model 200a used to predict the progression of HCM in patient 10. The ANN model 200a in Figure 2A is fully connected, such that each neuron or node in one layer of the ANN model 200a has a feedforward connection to every neuron in the next layer of the ANN model 200a. The connections in the ANN model 200a are unidirectional, from the input layer 204 to the output layer 206. The ANN model 200a has one input layer 204 and one output layer 206, with two or more hidden layers 208 and 210 between the input layer 204 and the output layer 206. Neural networks with more than one hidden layer are also called deep neural networks.
[0043] Each neuron in input layer 204 receives an input signal corresponding to an element in input vector 212. In a classifier neural network, each neuron in output layer 206 can represent a category. For example, if ANN model 200a is configured to distinguish between two disease outcome categories, such as diseased and not diseased, then two output neurons in output layer 206 can be used to represent these two disease outcomes. As another example, ANN model 200a can be configured to output 10 disease severity levels, in which case 10 output neurons can be used to represent the 10 levels. In these implementations, the final output of ANN model 200a is categorical. In some implementations, the activation function of the output neurons in these classification networks can be a SoftMax function, generating a probability for each neuron, where the sum of all probabilities of the output neurons is 1. In some implementations, the loss function can be a cross-entropy function. In other implementations, ANN model 200a is designed to output continuous variables. For example, it may be necessary to model multiple clinical events. In this type of implementation, the activation function of the output neuron can be a linear function, a sigmoid function, or another non-linear function, and the loss function can be, for example, the root mean square (RMS) error.
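The output activations and loss functions named above can be written out in a few lines of Python. This is a generic sketch using only the standard library; the function names and the example values are illustrative assumptions and do not describe ANN model 200a itself.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Convert raw output-neuron values into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: list[float], true_index: int) -> float:
    """Cross-entropy loss for one example whose true category is `true_index`."""
    return -math.log(probs[true_index])


def rms_error(predicted: list[float], target: list[float]) -> float:
    """Root mean square error for continuous outputs."""
    n = len(predicted)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / n)


# Two output neurons with equal raw values give equal probabilities.
probs = softmax([0.0, 0.0])
loss = cross_entropy(probs, true_index=0)
```

With equal probabilities for two categories, the cross-entropy loss equals ln 2, which is the loss of an uninformed binary classifier.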
[0044] Figure 2B shows the inputs and output of a neuron 214 of ANN model 200a. Here, the output y of neuron 214 is determined by the inputs from its upstream neurons 216, each of which is weighted by the connection strength w 218 between the neurons. The weighted sum of the inputs from the upstream neurons 216 is combined with a bias term b to ensure that neuron 214 is activated at least to some extent. The weighted sum and bias b are then passed through the activation function f to provide the output y of neuron 214.
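The computation of a single neuron, y = f(w · x + b), can be illustrated with the following Python sketch. The sigmoid is used here only as one possible activation function f; the function names and input values are assumptions made for illustration.

```python
import math
from typing import Callable


def sigmoid(z: float) -> float:
    """Logistic activation, one common choice for f."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs: list[float], weights: list[float], bias: float,
                  activation: Callable[[float], float] = sigmoid) -> float:
    """Weighted sum of upstream inputs plus bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Weighted sum is 0.5*1.0 + (-0.25)*2.0 + 0.0 = 0.0, and sigmoid(0.0) = 0.5.
y = neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.0)
```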
[0045] Figure 3 is a schematic diagram of an example convolutional neural network (CNN) model 200b used to predict HCM progression in patient 10. A conventional ANN such as that shown in Figures 2A and 2B may not be optimal for detecting spatial relationships that are invariant in size, position, or orientation. CNNs may be better at capturing such spatial relationships and achieving size, position, and orientation invariance. In some implementations, CNN model 200b may also be more efficient than conventional ANNs in extracting features from temporal response curves with complex patterns and relationships between curves.
[0046] CNN model 200b includes a first convolutional layer 222 with a stride of 1 and n1 channels (for n1 feature maps) using a 5x5 kernel. The first convolutional layer 222, which serves as the input layer, extracts spatial features with each 5x5 kernel and generates feature maps 224, 224a-n for each channel. CNN model 200b includes a first max-pooling layer 226 that uses a 2x2 kernel to downsample the feature maps 224 generated by the first convolutional layer 222. CNN model 200b also includes a second convolutional layer 230 that uses a stride of 1 and n2 channels (for n2 feature maps). CNN model 200b also includes a second max-pooling layer 234 that uses a 2x2 kernel to downsample the feature maps 232, 232a-n generated by second convolutional layer 230. The further downsampled feature maps are then flattened to provide vector 236, which becomes the input to the first fully connected layer 200a, which can be similar to the ANN model 200a of Figure 2A. The output 240 of the first fully connected layer 200a becomes the input of the second fully connected layer 200a, which can also be similar to the ANN model 200a.
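To make the layer stack above concrete, the following sketch computes the length of the flattened vector for a given input size. It assumes no padding, a pooling stride equal to the pooling kernel size, and a 5x5 kernel for the second convolutional layer (the text does not state that kernel size); the 32x32 input and n2 = 16 in the test values are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Spatial output size of non-overlapping max-pooling (stride == kernel)."""
    return (size - kernel) // kernel + 1


def flattened_length(height, width, n2, conv1_kernel=5, conv2_kernel=5, pool_kernel=2):
    """Length of the flattened vector after conv -> pool -> conv -> pool."""
    h, w = height, width
    for kernel in (conv1_kernel, conv2_kernel):
        h, w = conv_out(h, kernel), conv_out(w, kernel)
        h, w = pool_out(h, pool_kernel), pool_out(w, pool_kernel)
    return h * w * n2  # n2 feature maps leave the second pooling layer
```

Under these assumptions a 32x32 input shrinks to 28, 14, 10, and then 5 pixels per side, so 16 final feature maps flatten to 400 values.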
[0047] Figure 4 is a schematic diagram of an example generative adversarial network (GAN) model 200c used to predict the progression of HCM in patient 10. Here, GAN model 200c may include one or more conditions corresponding to the modalities of multidimensional medical data 110. In some implementations, GAN model 200c is configured to output a clinical progression curve indicating the severity of the disease at different times. In some implementations, the clinical progression curve values may be discretized to represent different disease states, such as NYHA grade IV. In some examples, the clinical progression curve is obtained empirically by repeatedly measuring at least one clinical outcome over a period of time. The clinical progression curve can then be used to train one or more models 200.
[0048] GAN model 200c includes a generator neural network 250 that generates data 252 that simulates real data, such as images, data, or clinical progression curves. The generator neural network 250 uses random noise as input 254, which provides a randomization mechanism to the generated data 252, making the generated data 252 similar to, but not identical to, real data 262. The data 252 generated by the generator neural network 250 (e.g., generated image 252) and real data 262 (e.g., real image 262) can be used as labeled data to train a discriminator neural network 270. The discriminator neural network 270 is trained with a loss function 272 configured to improve its ability to distinguish between real image 262 and generated image 252. A complementary loss function 274 is used to train the generator neural network 250. Because the generator neural network 250 initially performs poorly in generating images 252 similar to real images 262, the discriminator neural network 270 is penalized less, while the generator neural network 250 is penalized more. As the generator neural network 250 improves, the discriminator neural network 270 is penalized more, while the generator neural network 250 is penalized less. The GAN model 200c typically stabilizes when the discriminator neural network 270 becomes stable, which may vary depending on the scenario. In a non-limiting example, the GAN model 200c stabilizes when the discriminator neural network 270 can correctly distinguish real data from generated data only approximately 50% of the time. In the GAN model 200c, each neuron in the final output layer can represent a value in the output data space (e.g., pixel intensity in an image).
[0049] In some implementations, the GAN model 200c includes a conditional generative adversarial network (cGAN) model that enables a generator neural network 250 to generate images 252 that satisfy certain conditions, such as the gender of a patient in the image. In some implementations, this conditionalization can be achieved by labeling training samples according to conditions and encoding these conditions into one or more variables in the input of the generator neural network 250. For example, medical data can be categorized into different conditions, and training data can also be partitioned according to different conditions. The input of the cGAN model to the generator neural network 250 can include one or more features that encode the conditions of the training data. During data generation, by specifying input features corresponding to a certain medical condition, the generator neural network 250 can generate a time response curve that satisfies that condition.
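One common way to encode a condition into the generator input, sketched here as an assumption rather than as the disclosed implementation, is to concatenate a one-hot condition vector with the random noise vector. The helper names, the number of conditions, and the noise dimension are hypothetical.

```python
import random


def one_hot(index, num_conditions):
    """Encode a categorical condition as a one-hot vector."""
    vec = [0.0] * num_conditions
    vec[index] = 1.0
    return vec


def generator_input(condition_index, num_conditions, noise_dim, rng):
    """Build the generator input: random noise followed by the encoded condition."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + one_hot(condition_index, num_conditions)


rng = random.Random(0)
z = generator_input(condition_index=2, num_conditions=4, noise_dim=8, rng=rng)
```

At generation time, changing only the condition entries while resampling the noise yields different samples for the same condition.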
[0050] Figure 5 is a schematic diagram of an example variational autoencoder (VAE) model 200d used to predict HCM progression in patient 10. VAE model 200d is a neural network model that includes an encoder network 280 for encoding inputs into latent variables in a latent space 290 and a decoder network 282 for decoding and reconstructing data from the latent space 290. In some examples, VAE model 200d is an unsupervised learning model that can extract dimensionality-reduced features, because it can be trained on unlabeled data with a loss function that compares the unencoded inputs with the decoded outputs. VAE model 200d can also be used to generate data that is similar to but different from its training data. This difference may stem from random sampling of latent variables.
[0051] As shown in the figure, the VAE model 200d may optionally include a convolutional layer 284, a multi-layer encoder portion 280, a hidden layer 286, and a multi-layer decoder portion 282 at the input side. The VAE model 200d is configured to receive input data 288, such as time response curves obtained from test samples. In some implementations, the time response curve is implemented as a clinical progression curve. In some implementations, unimodal or multimodal medical data is provided as input 288 to the VAE model 200d. Optionally, the input data 288 is organized and provided to the convolutional layer 284, which is configured to extract latent relevant features from the data. The VAE model 200d is configured such that the input data 288 filtered by the convolutional layer 284 is processed by the encoder layer 280 and decoded by the decoder layer 282. A hidden layer 286 is located between the encoder 280 and the decoder 282, and this hidden layer is configured to store the fully encoded data in a latent space 290.
[0052] In the example shown, hidden layer 286 stores a multidimensional latent space representation 290 of the fully encoded data. The latent space representation 290 comprises multiple data points, each associated with a specific sample or a specific reading taken from a sample. In a conventional autoencoder, the data in the hidden layer is static. However, in VAE model 200d, the data in hidden layer 286 is associated with random noise sampled from a distribution such as a Gaussian distribution, thus providing random variation to VAE model 200d. Consequently, the encoded data and the decoded data may differ, allowing the generation of new data that is similar to but different from the training data. Therefore, VAEs can be used as generative models in some implementations.
[0053] Each data point in the latent space 290 includes a feature vector with fewer dimensions than the input and output data of the VAE model 200d. Therefore, VAEs can also be used to extract features, reduce dimensionality, and / or reduce noise. The extracted low-dimensional features can be used as input to other ML models, such as ANNs or regression models.
[0054] Training the VAE model 200d can employ a loss function and/or other techniques that project the input data into the latent space probabilistically. In some implementations, the loss function uses a regularization term based on the Kullback-Leibler divergence. The feature extractor projects the data as a distribution of values along the axes in the latent space 290, rather than discrete values. Features of the distribution can include, for example, their central tendency (mean, median, etc.) and/or their variance in the latent space 290. Training can lead to a learned distribution (in the latent space 290) that resembles the true prior distribution (input data).
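As a hedged illustration of such a loss, the sketch below assumes a diagonal Gaussian encoder output (mean and log-variance per latent axis), a standard normal prior, and a squared-error reconstruction term. These choices, the function names, and the `beta` weighting are assumptions; the text does not specify them.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps drawn from a standard Gaussian."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Kullback-Leibler divergence KL(N(mu, sigma^2) || N(0, 1)), summed over axes."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def vae_loss(x, x_reconstructed, mu, log_var, beta=1.0):
    """Squared reconstruction error plus the weighted KL regularization term."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_reconstructed))
    return recon + beta * kl_to_standard_normal(mu, log_var)
```

The random draw in `sample_latent` is what makes decoded outputs similar to, but not identical to, the encoded inputs.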
[0055] In some implementations, one or more models 200 include one or more survival models. A survival model is a model that analyzes data on the timing of events. The primary focus of a survival model is predicting the time before an event of interest occurs. Events in a survival model are typically endpoints, such as death, disease progression, disease relapse, or component failure. However, an event can be any event of interest. A unique aspect of a survival model is its survival function, which handles censored data. Censoring occurs when some subjects do not experience an event during the study period or when subjects cannot be followed up. The exact time of the event for these subjects is unknown, posing a significant challenge to the analysis. The survival function of a survival model represents the probability of survival (or the probability that the event will not occur) within a given timeframe. The survival function is a key output of the survival model. In some implementations, the survival model provides an output of the probability of disease progression from one stage to another. The hazard function of a survival model describes the instantaneous risk of an event occurring at a given time, assuming that the event has not yet occurred. The hazard function of a survival model helps in understanding how risk factors influence the likelihood of the event occurring over time.
[0056] In some implementations, survival models include the Kaplan-Meier estimator, a nonparametric statistic used to estimate survival functions from lifetime data. The Kaplan-Meier model is used to construct survival curves for patient cohorts based on a combined classification of genetic, clinical, and echocardiographic markers. This nonparametric estimator is suitable for determining the probability of HCM progression over time within the cohort. The generated survival curves provide a visual representation of the time-to-event data, showing the likelihood of disease progression or the occurrence of a major cardiac event within a specific time interval. The Kaplan-Meier estimator is particularly adept at handling censored data, a common challenge in medical research, as patients may be lost to follow-up, or the event of interest may not have occurred by the end of the study. Alternatively, survival models can include the Cox proportional hazards model, a semiparametric model widely used in medical research. The Cox proportional hazards model models hazard as a function of several covariates. Alternatively, survival models can include parametric survival models, such as exponential models, Weibull models, log-normal models, etc., which assume that survival times follow a specific distribution.
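The Kaplan-Meier estimate mentioned above can be sketched in a few lines. At each distinct event time the running survival probability is multiplied by (1 - events / number at risk), and censored subjects leave the risk set without counting as events. The function name and the toy cohort are illustrative assumptions.

```python
def kaplan_meier(durations, event_observed):
    """Return [(time, survival_probability)] at each distinct observed event time."""
    data = sorted(zip(durations, event_observed))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Group all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if events > 0:
            survival *= 1.0 - events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # events and censored subjects both leave the risk set
    return curve


# Hypothetical cohort: follow-up in years, 1 = progression observed, 0 = censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

For this toy cohort the estimated probability of remaining progression-free is about 0.8 after year 1, 0.6 after year 2, and 0.3 after year 3.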
[0057] In some implementations, one or more models 200 include one or more multivariate time response function (mTRF) models. Here, the mTRF model may include independent variables corresponding to the modalities of the multidimensional medical data. The mTRF model can be configured to provide a clinical progression curve as output, where the time response function (TRF) values correspond to the disease severity at different times. In some examples, the TRF values are discretized to represent disease states, such as NYHA grade IV. In some implementations, the mTRF model is used to predict the disease progression curve. The mTRF model can be trained to receive multidimensional medical data 110 of patient 10 as input and predict the clinical progression curve of patient 10, where the TRF function values correspond to the disease severity at different times.
[0058] A TRF is a univariate regression model in which time-response functions are recorded on N channels, indexed by n, each representing a time point of interest. The TRF model assumes that the instantaneous response r(t, n), sampled at times t = 1, 2, ..., T on channel n, is given by the convolution of the stimulus attribute s(t) with an unknown channel-specific TRF w(τ, n). This response can be represented in discrete time as:
$$ r(t, n) = \sum_{\tau} w(\tau, n)\, s(t - \tau) + \varepsilon(t, n) $$
where ε(t, n) is the residual response at each channel n that is not explained by the model. The TRF can be viewed as a filter describing the linear transformation from a sustained stimulus (e.g., a medical data variable) to a sustained channel response. The TRF w(τ, n) describes this transformation from stimulus s to response r over a specific range of time lags τ relative to the instantaneous occurrence of the stimulus feature s(t). The time lag range τ used to calculate w(τ, n) might be the range commonly used to capture a patient's response to a medical variable, such as the range determined through empirical medical observation. The TRF w(τ, n) can be calculated, for example, by minimizing the mean squared error (MSE) between the actual time response curve r(t, n) and the curve \hat{r}(t, n) predicted by the convolution, as shown below:
$$ \min_{w} \sum_{t} \left[ r(t, n) - \hat{r}(t, n) \right]^{2}, \qquad \hat{r}(t, n) = \sum_{\tau} w(\tau, n)\, s(t - \tau) $$
Here, the univariate TRF receives a single stimulus variable and provides a clinical progression curve for each of the N channels. If multiple stimuli are encoded as multiple independent variables, an mTRF model can be applied.
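A minimal sketch of estimating a single-channel TRF by least squares follows. It models the response as the stimulus convolved with the TRF over a fixed set of lags and minimizes the squared error by solving the normal equations. The lag range of 0 to 2 samples, the synthetic stimulus, and all function names are assumptions for illustration; practical estimators typically also add regularization, which is omitted here.

```python
import random


def lagged_matrix(stimulus, n_lags):
    """Row t holds s(t), s(t-1), ..., s(t-n_lags+1), with zeros before the start."""
    return [[stimulus[t - lag] if t - lag >= 0 else 0.0 for lag in range(n_lags)]
            for t in range(len(stimulus))]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_trf(stimulus, response, n_lags):
    """Least-squares TRF: solve (S^T S) w = S^T r for the lag weights w."""
    s = lagged_matrix(stimulus, n_lags)
    sts = [[sum(row[i] * row[j] for row in s) for j in range(n_lags)]
           for i in range(n_lags)]
    s_t_r = [sum(row[i] * r for row, r in zip(s, response)) for i in range(n_lags)]
    return solve_linear(sts, s_t_r)


# Synthetic check: a noise-free response generated from a known TRF is recovered.
rng = random.Random(0)
stimulus = [rng.gauss(0.0, 1.0) for _ in range(200)]
true_w = [0.5, -0.2, 0.1]
response = [sum(w * x for w, x in zip(true_w, row)) for row in lagged_matrix(stimulus, 3)]
estimated_w = fit_trf(stimulus, response, n_lags=3)
```

For an mTRF, the lagged matrices of several stimulus variables would be placed side by side before solving.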
[0059] Figure 6 is a schematic diagram of an example fusion architecture 600, 600a, which includes a fusion unit 601 that fuses features 602, 602a-n, extracted by multiple different feature extractors 604, 604a-n corresponding to different data modalities 606, 606a-n, into a set of fused multimodal features 608. Here, one or more models 200 process this set of fused multimodal features 608 to predict HCM progression in patient 10. In some examples, fusion unit 601 uses a weighted sum (e.g., a linear sum) of features 602 to fuse features 602.
[0060] Figure 7 is a schematic diagram of another example fusion architecture 600, 600b. In the example shown in Figure 7, each extracted feature 602 is processed by a corresponding model 200, and the predictions 610, 610a-n generated by the models 200 are then fused by fusion unit 612. In some examples, fusion unit 612 uses a weighted sum (e.g., a linear sum) of predictions 610 to fuse predictions 610.
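The two weighted-sum fusion strategies described above can be sketched as follows. The helper names, the weights, and the requirement that per-modality feature vectors share one length are assumptions; dividing by the total weight in the prediction-level case is an added choice that keeps the fused value a probability.

```python
def fuse_features(feature_vectors, weights):
    """Feature-level fusion: element-wise weighted sum of per-modality feature vectors."""
    if len(feature_vectors) != len(weights):
        raise ValueError("one weight per modality is required")
    length = len(feature_vectors[0])
    if any(len(v) != length for v in feature_vectors):
        raise ValueError("feature vectors must share one length to be summed")
    return [sum(w * v[i] for w, v in zip(weights, feature_vectors)) for i in range(length)]


def fuse_predictions(predictions, weights):
    """Prediction-level fusion: weighted average of per-modality probabilities."""
    if len(predictions) != len(weights):
        raise ValueError("one weight per prediction is required")
    return sum(w * p for w, p in zip(weights, predictions)) / sum(weights)


# Hypothetical per-modality progression probabilities (e.g., imaging, ECG, laboratory).
fused = fuse_predictions([0.2, 0.6, 0.4], [1.0, 2.0, 1.0])
```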
[0061] One or more models 200 can be trained using any number and/or type of training methods and processes, such as unsupervised, self-supervised, semi-supervised, and supervised training methods. In some implementations, models 200 (such as VAEs) are trained in a semi-supervised manner, which uses both labeled and unlabeled training data simultaneously. An example of a semi-supervised training technique is described in the 2021 paper “A Survey of Deep Semi-Supervised Learning” (http://arxiv.org/abs/2103.00550) by Yang, X., Song, Z., King, I., and Xu, Z., which is incorporated herein by reference in its entirety. In some implementations, models 200 are trained in one or more iterations, and in practice, multiple individual models 200 may be employed, with some models serving as the basis for transfer learning for subsequent improvements or versions of the model. In some implementations, the feature extractor is trained partly using supervised learning and partly using unsupervised learning.
[0062] In some implementations, techniques such as model validation, cross-validation, and bootstrap methods are used to assess the reliability of predictions and prevent overfitting, a common problem in predictive modeling. Alternatively, multiple training data sources can be used to learn in multiple stages using mechanisms such as transfer learning. Transfer learning is a training process that starts with a previously trained model and adopts its architecture and current parameter values (e.g., previously trained weights and biases), but then changes the model's parameter values to reflect new or different training data. In various implementations, the architecture of the original model (including convolutional windows, if present) and optionally its hyperparameters remain unchanged during further training, such as via transfer learning. In some examples, one or more training procedures produce a first, preliminary model 200 that has been trained. Once fully trained using the training data, the preliminary model 200 can be used as a starting point, for example, to train a second model 200. The training of the second model 200 can begin with a model using the architecture and parameter settings of the first trained model 200, but by incorporating information from additional training data to refine the parameter settings.
[0063] Alternatively, the training of one or more models 200 can be performed in two phases. During the first phase, one or more models 200 can be trained using survival models to predict the risk of class progression (or other endpoints of interest). Example survival models may include Cox regression models, tree-based models, and deep neural network-based models using different modalities. Nested cross-validation (NCV) can then be applied to select one or more optimal models 200 based on both discriminative and calibration performance. During the second phase, a calibration step can be applied to transform the outputs of the models 200 into interpretable results, such as the probability of NYHA class progression within three years.
[0064] Figure 8 is a schematic diagram of an example training process 800, 800a for training a model 200 to predict the progression of HCM in patient 10. Training process 800a can be executed on computing device 20 (i.e., data processing hardware 22) or on the data processing hardware of another computing system (such as a remote physical server or virtual server). In the illustrated example, training process 800a uses a training dataset 810, which includes multiple training samples 812, 812a-n, to train one or more models 200. Here, each specific training sample 812 includes: corresponding baseline training data 814, which includes baseline multidimensional medical data 110 spanning multiple different modalities; corresponding follow-up training data 816, which includes follow-up multidimensional medical data 110 collected after the baseline training data spanning multiple different modalities; and corresponding real clinical outcomes 818.
[0065] For each specific training sample 812 in the training dataset 810, the training process 800a uses model 200 to process the corresponding baseline training data 814 and follow-up training data 816 to generate a predicted prognosis 202, such as the probability that the patient's hypertrophic cardiomyopathy (HCM) progresses before a threshold date. The loss term module 820 determines the loss 822 based on the corresponding real clinical outcome 818 and the predicted prognosis 202.
[0066] Subsequently, training process 800a trains model 200 based on loss 822 to teach the ML medical prognosis model 200 how to predict the corresponding real clinical outcomes 818. In some cases, training process 800a trains model 200 by adjusting, regulating, updating, fine-tuning, etc., one or more parameters or weights of model 200 based on loss 822.
[0067] In some examples, training dataset 810 is hosted on the site of the owner of multidimensional medical data 110 within training dataset 810, which may include hospitals, educational institutions, or other sources. Without exposing multidimensional medical data 110, training dataset 810 can be accessed via federated data access, and the on-site worker nodes (not shown for clarity) of the owner of training dataset 810 can use the corresponding training dataset 810 to train model 200. In this way, model 200 can be trained by worker nodes associated with different sources of training data 810, ensuring that training data 810 is never shared or interpreted; instead, model 200 is trained over multiple cycles, each cycle corresponding to a different worker node. Federated data access and federated learning techniques for training predictive models are described in PCT patent application No. PCT/US2021/061417, filed December 1, 2021, which claims priority to European patent application No. 20306478.7, filed December 1, 2020. The published texts of these prior applications are considered part of the published text of this application, and are incorporated herein by reference in their entirety.
[0068] Figure 9 is a schematic diagram of another example training process 800, 800b for training a model 200 used to predict HCM progression in patient 10. Training process 800b can be executed on computing device 20 (i.e., data processing hardware 22) or on the data processing hardware of another computing system (such as a remote physical server or virtual server). Training process 800b splits multidimensional medical data 110 from multiple patients 10 into a training dataset 912 and a test dataset 914. In some examples, training dataset 912 is augmented with semi-synthetic training data 916 to form training data 918. Training data 918 is then used to train the ML medical prognosis model 200, for example, using training process 800a of Figure 8. Subsequently, training process 800b uses the test dataset 914 to test the performance of the trained ML medical prognosis model 200. In some examples, training process 800b trains model 200 by adjusting, regulating, updating, fine-tuning, etc., one or more parameters or weights of model 200 based on the calculated loss.
[0069] Figure 10 is a schematic diagram of another example training process 800, 800c for training multiple models 200 to predict HCM progression in patient 10. Training process 800c can be executed on computing device 20 (i.e., data processing hardware 22) or on the data processing hardware of another computing system (such as a remote physical server or virtual server). In the example shown in Figure 10, training process 800a of Figure 8 is repeated for each modality 1002, 1002a-n of the training dataset of ML-ready features 1004 to train a corresponding model 200 for that modality 1002. In some examples, the training process 800c trains the model 200 by adjusting, regulating, updating, fine-tuning, etc., one or more parameters or weights of the model 200 based on the calculated loss.
[0070] Figure 11 is a schematic diagram of yet another example training process 800, 800d for training model 200 to predict HCM progression in patient 10. Training process 800d can be executed on computing device 20 (i.e., data processing hardware 22) or on the data processing hardware of another computing system (such as a remote physical server or virtual server). In the example shown in Figure 11, the ML medical prognosis model 200 is trained using nested cross-validation, where the intermediate models 1104 are trained using training process 800a of Figure 8. In some examples, training process 800d trains model 200 by adjusting, regulating, updating, fine-tuning, etc., one or more parameters or weights of model 200 based on the calculated loss.
[0071] Figure 12 is a schematic diagram of an example data preprocessing procedure 1200 used to generate the training dataset of ML-ready features 1202, which can be used as the training dataset 810 of Figure 8, the data 912 and 914 of Figure 9, the training dataset of ML-ready features 1004 of Figure 10, and/or the training dataset of ML-ready features 1102 of Figure 11. Process 1200 extracts features for each modality based on modality segmentation of multidimensional medical data 110, extracts clinical outcomes from clinical data, and stores the extracted features in the training dataset 1202 of ML-ready features.
[0072] In the example shown in Figure 12, patients 10 can be selected for inclusion in the multidimensional medical training data 110 based on one or more inclusion and/or exclusion criteria. Example inclusion criteria include, but are not limited to:
● Adult patients diagnosed with HCM (e.g., LVH confirmed by echocardiography or CMR and not explained by other causes).
● Patients with baseline NYHA class I or “early” NYHA class II who have had at least one follow-up visit that includes the patient’s NYHA class (which, according to AHA/ACC criteria, can be class I or higher). Here, “early” NYHA class II is defined as NYHA class II patients with mild symptoms who are potentially treated with a maximum of one basic medication (a beta-blocker, a calcium channel blocker, or disopyramide), and no history of atrial fibrillation, and resting or induced LVOT gradient <50 mmHg, and NT-pro-BNP <300 pg/ml, and LAVi <35 ml/m2 (if available), and E/e' <14 (if available).
● Patients must have data available at baseline and longitudinal follow-up (clinical, imaging, biological, omics, and/or exercise testing).
● Patients visiting primary or secondary cardiology departments.
● Follow-up for at least 3 years.
Example exclusion criteria include, but are not limited to, patients with non-sarcomeric HCM (e.g., any phenocopy disease), such as amyloidosis, hemochromatosis, Fabry disease, left ventricular hypertrophy (LVH) in aortic stenosis or exercise-induced cardiac syndrome, or hypertension.
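The numeric thresholds in the “early” NYHA class II definition above lend themselves to a simple screening function, sketched here under stated assumptions. The record type and field names are hypothetical, the LVOT value is interpreted as a pressure gradient, and the qualitative "mild symptoms" requirement is not encoded. Measurements marked "if available" are skipped when missing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BaselineRecord:
    """Hypothetical baseline record; field names are illustrative only."""
    nyha_class: int
    n_basic_medications: int
    atrial_fibrillation_history: bool
    lvot_gradient_mmhg: float
    nt_probnp_pg_ml: float
    lavi_ml_m2: Optional[float] = None      # optional ("if available")
    e_over_e_prime: Optional[float] = None  # optional ("if available")


def is_early_class_ii(rec):
    """Check the numeric criteria for "early" NYHA class II."""
    return (
        rec.nyha_class == 2
        and rec.n_basic_medications <= 1
        and not rec.atrial_fibrillation_history
        and rec.lvot_gradient_mmhg < 50
        and rec.nt_probnp_pg_ml < 300
        and (rec.lavi_ml_m2 is None or rec.lavi_ml_m2 < 35)
        and (rec.e_over_e_prime is None or rec.e_over_e_prime < 14)
    )


def meets_baseline_class_criterion(rec):
    """Baseline NYHA class I, or "early" NYHA class II."""
    return rec.nyha_class == 1 or is_early_class_ii(rec)
```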
[0073] During a patient's follow-up period, his or her NYHA level can be assessed several times, resulting in a time series of level assessments that can be used to deduce the progression time available for training one or more models 200. For ease of reading, a progression from level I to level II can be assessed; however, the same technique applies to progression from an "early" NYHA level II to a higher NYHA level.
[0074] Here, the progression time can be defined as the time between the baseline (index date) and the date when progression is first observed during patient follow-up. If no progression occurs during follow-up, the patient is considered a censored case. The censoring time is set to the date of the patient's last visit. To increase the size of the training set, synthetic patients can be generated using data collected from patient follow-up. For example, each follow-up date after the index date where the level has not progressed can be labeled as a synthetic observation to be used as a new index date for the synthetic patient.
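The labeling rule above can be sketched as follows, assuming each patient is represented as a chronologically sorted list of (day, NYHA class) visits whose first entry is the index date. The data layout and function names are assumptions. Each later visit without progression becomes the index date of a synthetic patient, provided at least one further visit follows it.

```python
def progression_label(visits, baseline_class):
    """Return (time, event): days to first observed progression, or censoring time.

    visits is a list of (day, nyha_class) tuples sorted by day; visits[0] is the
    index date. event is False when the patient is censored at the last visit.
    """
    index_day = visits[0][0]
    for day, nyha in visits[1:]:
        if nyha > baseline_class:
            return day - index_day, True
    return visits[-1][0] - index_day, False


def synthetic_patients(visits):
    """Create extra samples by re-indexing at each follow-up visit before progression."""
    baseline_class = visits[0][1]
    samples = []
    for i in range(1, len(visits) - 1):  # keep at least one visit after the new index
        if visits[i][1] > baseline_class:
            break  # class has progressed; later visits are not valid index dates
        samples.append(progression_label(visits[i:], baseline_class))
    return samples


# Hypothetical patient: NYHA class I at days 0, 365, and 730, class II at day 1095.
visits = [(0, 1), (365, 1), (730, 1), (1095, 2)]
label = progression_label(visits, baseline_class=1)
extra = synthetic_patients(visits)
```

For this patient the original sample progresses after 1095 days, and the two synthetic patients progress after 730 and 365 days respectively.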
[0075] In some cases, a patient's HCM may be assessed as shifting back and forth between two NYHA levels, as shown in Figure 16A. In such examples, as shown in Figure 16B, patient 10 could be considered to have reached the higher of the two NYHA levels when first assessed at that level. Alternatively, as shown in Figure 16C, one or more synthetic patients can be created using the multidimensional data 110 of the patient, corresponding to the various times when patient 10 was rated at the higher NYHA level. Such synthetic patients can be used as additional training samples for training one or more models 200.
[0076] Figure 13 is a schematic diagram of an example process 1300 for generating and selecting an ML medical prognosis model 200 for predicting HCM progression in patient 10.
[0077] Figure 14 is a flowchart illustrating an example arrangement of operations of a computer-implemented method 1400 for predicting HCM progression in patient 10. The operations can be performed by data processing hardware 1710 of Figure 17 (e.g., the data processing hardware 22 of computing device 20) based on executing instructions stored on memory hardware 1720 (e.g., memory hardware 24 of computing device 20). Method 1400 can be implemented in many other ways. For example, the execution order of operations may be changed, and/or one or more operations and/or interactions may be altered, eliminated, subdivided, or merged. Additionally, the operations of Figure 14 can be performed sequentially and/or in parallel, for example, by separate processing threads, processors, devices, discrete logic, circuits, etc.
[0078] At step 1402, method 1400 includes receiving multidimensional medical data 110 of patient 10 from one or more sources. At step 1404, method 1400 includes extracting baseline features from the multidimensional medical data 110. At step 1406, method 1400 includes processing the extracted features using one or more ML medical prognostic models 200 to predict the probability of HCM progression in patient 10 before a threshold date. Here, the one or more ML medical prognostic models 200 are trained using a training process 800, which includes: obtaining corresponding baseline training data for each of the plurality of patients 10, the baseline training data including baseline multidimensional medical data 110 spanning multiple different modalities; and obtaining corresponding follow-up training data for each of the plurality of patients 10, the follow-up training data including follow-up multidimensional medical data 110 collected after the baseline training data spanning multiple different modalities, and corresponding real clinical outcomes.
[0079] Figure 15 is a flowchart illustrating an example arrangement of operations of a computer-implemented method 1500 for training an ML medical prognostic model to predict the progression of HCM in patient 10. The operations can be performed by data processing hardware 1710 of Figure 17 (e.g., the data processing hardware 22 of computing device 20) based on executing instructions stored on memory hardware 1720 (e.g., memory hardware 24 of computing device 20). Method 1500 can be implemented in many other ways. For example, the execution order of operations may be changed, and/or one or more operations and/or interactions may be altered, eliminated, subdivided, or merged. Additionally, the operations of Figure 15 can be performed sequentially and/or in parallel, for example, by separate processing threads, processors, devices, discrete logic, circuits, etc.
[0080] At operation 1502, method 1500 includes obtaining corresponding baseline training data for each specific patient 10 among a plurality of patients 10 diagnosed with HCM and meeting inclusion criteria. This baseline training data includes baseline multidimensional medical data 110 collected for the specific patient 10 across multiple different modalities within a threshold time period starting from the corresponding index date assigned to the specific patient 10. At operation 1504, method 1500 includes obtaining corresponding follow-up training data for each specific patient 10 among a plurality of patients 10 diagnosed with HCM and meeting inclusion criteria. This follow-up training data includes follow-up multidimensional medical data 110 collected for the specific patient 10 across multiple different modalities after the corresponding index date, and corresponding real clinical outcomes. At operation 1506, method 1500 includes training the one or more ML medical prognostic models 200 based on the corresponding baseline training data and the corresponding follow-up training data to teach the one or more ML medical prognostic models 200 to predict the corresponding real clinical outcome.
[0081] Figure 17 is a schematic diagram of an example computing device 1700 that can be used to implement the systems and methods described in this document. The computing device 1700 is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The components shown herein, their connections and relationships, and their functions are merely exemplary and are not intended to limit the implementations of the invention described and/or claimed in this document.
[0082] The computing device 1700 includes a processor 1710 (i.e., data processing hardware) for implementing data processing hardware 22, a memory 1720 (i.e., memory hardware) for implementing memory hardware 24, a storage device 1730 (i.e., memory hardware) for implementing memory hardware 24 or storing model 200, ML-ready features, and training data, a high-speed interface/controller 1740 connected to memory 1720 and high-speed expansion port 1750, and a low-speed interface/controller 1760 connected to low-speed bus 1770 and storage device 1730. Each of components 1710, 1720, 1730, 1740, 1750, and 1760 is interconnected using various buses and can be mounted on a common motherboard or otherwise mounted where appropriate. Processor 1710 can process instructions for execution within computing device 1700, including instructions stored in memory 1720 or on storage device 1730, to display graphical information for a graphical user interface (GUI) on an external input/output device, such as display 1780 coupled to high-speed interface 1740. In other implementations, multiple processors and/or multiple buses may be used with multiple memories and various types of memory, where appropriate. Furthermore, multiple computing devices 1700 may be connected, each providing a portion of the necessary operations (e.g., as a server rack, a group of blade servers, or a multiprocessor system).
[0083] Memory 1720 stores information non-temporarily within computing device 1700. Memory 1720 may be a computer-readable medium, one or more volatile memory cells, or one or more non-volatile memory cells. Non-temporarily stored memory 1720 may be a physical means for storing programs (e.g., instruction sequences) or data (e.g., program state information) on a temporary or permanent basis for use by computing device 1700. Examples of non-volatile memory include, but are not limited to, flash memory and read-only memory (ROM) / programmable read-only memory (PROM) / erasable programmable read-only memory (EPROM) / electrically erasable programmable read-only memory (EEPROM) (e.g., commonly used in firmware, such as boot programs). Examples of volatile memory include, but are not limited to, random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), phase-change memory (PCM), and disks or tapes.
[0084] Storage device 1730 provides mass storage for computing device 1700. In some implementations, storage device 1730 is a computer-readable medium. In various implementations, storage device 1730 may be a floppy disk device, hard disk device, optical disk device, magnetic tape device, flash memory or other similar solid-state memory device, or device array (including devices in a storage area network or other configuration). In additional implementations, a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer-readable or machine-readable medium, such as memory 1720, storage device 1730, or memory on processor 1710.
[0085] High-speed controller 1740 manages bandwidth-intensive operations of computing device 1700, while low-speed controller 1760 manages less bandwidth-intensive operations. This allocation of responsibilities is merely exemplary. In some implementations, high-speed controller 1740 is coupled to memory 1720, display 1780 (e.g., via a graphics processor or accelerator), and high-speed expansion port 1750, which can accept various expansion cards (not shown). In some implementations, low-speed controller 1760 is coupled to storage device 1730 and low-speed expansion port 1790. Low-speed expansion port 1790, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, Wireless Ethernet), may be coupled to one or more input / output devices, such as keyboards, pointing devices, scanners, or networking devices (such as switches or routers), for example, via a network adapter.
[0086] As shown in the accompanying drawings, the computing device 1700 can be implemented in a variety of different forms. For example, it can be implemented as a standard server 1700a or multiple times in a set of such servers 1700a, as a laptop computer 1700b, or as part of a rack server system 1700c.
[0087] The various implementations of the systems and techniques described herein can be implemented in digital electronic and / or optical circuit systems, integrated circuit systems, specially designed ASICs (Application-Specific Integrated Circuits), computer hardware, firmware, software, and / or combinations thereof. These different implementations can include implementations in one or more computer programs executable and / or interpretable on a programmable system, said programmable system including at least one programmable processor, which may be dedicated or general-purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
[0088] A software application (i.e., a software resource) can refer to computer software that causes a computing device to perform a task. In some instances, a software application may be referred to as an "application," "app," or "program." Example applications include, but are not limited to, system diagnostic applications, system management applications, system maintenance applications, word processing applications, spreadsheet applications, messaging applications, media streaming applications, social networking applications, and game applications.
[0089] These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor and can be implemented using high-level procedural and / or object-oriented programming languages and / or assembly / machine languages. As used herein, the terms “machine-readable medium” and “computer-readable medium” refer to any computer program product, non-transitory computer-readable medium, device, and / or apparatus (e.g., disk, optical disk, memory, programmable logic device (PLD)) used to provide machine instructions and / or data to a programmable processor, including machine-readable media that receive machine instructions as machine-readable signals. The term “machine-readable signal” refers to any signal used to provide machine instructions and / or data to a programmable processor.
[0090] The processes and logic flows described in this specification can be executed by one or more programmable processors (also known as data processing hardware) that execute one or more computer programs to perform functions by manipulating input data and generating output. These processes and logic flows can also be executed by special-purpose logic circuitry (e.g., FPGAs (Field-Programmable Gate Arrays) or ASICs (Application-Specific Integrated Circuits)). By way of example, processors suitable for executing computer programs include general-purpose microprocessors and special-purpose microprocessors, as well as any one or more processors of any kind of digital computer. Typically, a processor receives instructions and data from read-only memory or random access memory, or both. The basic elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Typically, a computer will also include one or more mass storage devices (e.g., magnetic disks, magneto-optical disks, or optical disks) for storing data, or be operatively coupled to receive data from or transfer data to one or more mass storage devices, or both. However, a computer need not have such devices. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, by way of example, semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks. The processor and memory may be supplemented by, or incorporated in, special-purpose logic circuitry.
[0091] To provide interaction with a user, one or more aspects of this disclosure can be implemented on a computer having a display device for displaying information to the user (e.g., a CRT (cathode ray tube), LCD (liquid crystal display) monitor, or touchscreen), and optionally a keyboard and pointing device (e.g., a mouse or trackball) for the user to provide input to the computer. Other types of devices can also be used to provide interaction with the user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic input, voice input, or tactile input. Additionally, the computer can interact with the user by sending documents to and receiving documents from the device used by the user (e.g., sending web pages to a web browser in response to a request received from a web browser on the user's client device).
[0092] Unless otherwise expressly specified, the phrase “at least one of A, B, or C” is intended to refer to any combination or subset of A, B, and C, such as: (1) only at least one A; (2) only at least one B; (3) only at least one C; (4) at least one A and at least one B; (5) at least one A and at least one C; (6) at least one B and at least one C; and (7) at least one A, at least one B, and at least one C. Furthermore, unless otherwise expressly specified, the phrase “at least one of A, B, and C” is intended to refer to any combination or subset of A, B, and C, such as: (1) only at least one A; (2) only at least one B; (3) only at least one C; (4) at least one A and at least one B; (5) at least one A and at least one C; (6) at least one B and at least one C; and (7) at least one A, at least one B, and at least one C. Furthermore, unless otherwise expressly specified, “A or B” is intended to refer to any combination of A and B, such as: (1) only A; (2) only B; and (3) A and B.
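By way of non-limiting illustration, the seven combinations enumerated above correspond to the non-empty subsets of A, B, and C. The following Python sketch lists them; the function name `nonempty_subsets` is hypothetical and is introduced only for this example.

```python
from itertools import combinations


def nonempty_subsets(items):
    """Return every non-empty combination of the given items, smallest first."""
    subsets = []
    for size in range(1, len(items) + 1):
        subsets.extend(combinations(items, size))
    return subsets


subsets = nonempty_subsets(["A", "B", "C"])
for subset in subsets:
    # Each printed line is one reading of "at least one of A, B, or C".
    print(" and ".join(subset))
```

Applied to only two items, the same function yields the three readings of "A or B" given above.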
[0093] Various implementations have been described. However, it should be understood that various modifications can be made without departing from the spirit and scope of this disclosure. Therefore, other implementations are within the scope of the following claims.
Claims
1. A computer-implemented method (1400) which, when executed on data processing hardware (1710), causes the data processing hardware (1710) to perform operations, said operations including: receiving multidimensional medical data (110) of a patient (10) from one or more sources; extracting features from the multidimensional medical data (110); and processing the extracted features using one or more machine learning (ML) medical prognostic models (200) to predict a probability (202) of hypertrophic cardiomyopathy (HCM) progression occurring for the patient (10) before a threshold date, wherein a training process (800) trains the one or more ML medical prognostic models (200) by: for each of a plurality of patients (10), obtaining corresponding baseline training data (810) that includes baseline multidimensional medical data (110) spanning multiple different modalities; for each of the plurality of patients (10), obtaining corresponding follow-up training data (810), the follow-up training data (810) including: follow-up multidimensional medical data (110) that spans the multiple different modalities and is collected after the baseline training data (810); and a corresponding real clinical outcome (818); and training the one or more ML medical prognostic models (200) based on the baseline training data (810) and the follow-up training data (810) to teach the one or more ML medical prognostic models (200) to predict the corresponding real clinical outcomes (818).
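By way of non-limiting illustration only, the receive, extract, and predict operations recited above might be sketched in Python as follows. A logistic function stands in for the one or more ML medical prognostic models (200); the feature names, weights, and bias are hypothetical values chosen for the example and are not taken from this disclosure.

```python
import math

# Hypothetical parameters for illustration only; a real model (200) would obtain
# its parameters from the training process (800).
WEIGHTS = {"max_wall_thickness_mm": 0.08, "log_biomarker": 0.5, "qrs_duration_ms": 0.01}
BIAS = -4.0


def extract_features(medical_data):
    """Flatten per-modality records into a single feature dictionary."""
    features = {}
    for modality_record in medical_data.values():
        features.update(modality_record)
    return features


def predict_progression_probability(features):
    """Map the extracted features to a probability in (0, 1) with a logistic function."""
    score = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical multidimensional medical data (110) keyed by modality.
patient_data = {
    "imaging": {"max_wall_thickness_mm": 18.0},
    "laboratory": {"log_biomarker": 2.5},
    "ecg": {"qrs_duration_ms": 100.0},
}
probability = predict_progression_probability(extract_features(patient_data))
print(f"Predicted probability of HCM progression: {probability:.3f}")
```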
2. The computer-implemented method (1400) according to claim 1, wherein the one or more ML medical prognostic models (200) include at least one of the following: survival model, neural network (200a), convolutional neural network (CNN) (200b), attention-based neural network, generative neural network (200c), autoencoder, variational autoencoder (VAE) (200d), regression model, linear model, nonlinear model, support vector machine, decision tree model, random forest model, ensemble model, Bayesian model, naive Bayesian model, k-means model, k-nearest neighbor model, principal component analysis, Markov model, and any combination thereof.
3. The computer-implemented method (1400) according to claim 2, wherein the regression model includes a multivariate survival model or a multivariate temporal response function (mTRF) model, wherein the mTRF model includes independent variables corresponding to the multidimensional medical data (110).
4. The computer-implemented method (1400) according to claim 2 or 3, wherein the generative neural network (200c) includes a conditional generative adversarial network (cGAN), the cGAN including one or more conditions corresponding to the multidimensional medical data (110).
5. The computer-implemented method (1400) according to any one of claims 1 to 4, wherein the baseline multidimensional medical data (110) is collected for the patient (10) within a threshold time period starting from an index date corresponding to the time when the patient (10) was diagnosed with HCM.
6. The computer-implemented method (1400) according to any one of claims 1 to 5, wherein the multidimensional medical data (110) comprises at least one of the following: medical imaging data; cardiac measurement data; clinical data; electrocardiogram data; laboratory test results; genomic data; or functional test results.
7. The computer-implemented method (1400) according to any one of claims 1 to 6, wherein: the patient (10) has a New York Heart Association (NYHA) Class I assessment; and the predicted probability (202) includes a predicted probability that the patient's (10) HCM will progress to NYHA Class II or higher before the threshold date.
8. The computer-implemented method (1400) according to any one of claims 1 to 6, wherein: the patient (10) has an early-stage New York Heart Association (NYHA) Class II assessment; and the predicted probability (202) includes a predicted probability that the patient's (10) HCM will progress to NYHA Class III or higher before the threshold date.
9. The computer-implemented method (1400) according to any one of claims 1 to 6, wherein the predicted probability (202) includes a predicted probability of at least one of the following: the patient's (10) HCM changing from non-obstructive HCM to obstructive HCM before the threshold date; or initiation of HCM treatment before the threshold date.
10. The computer-implemented method (1400) according to any one of claims 1 to 6, wherein the predicted probability (202) includes a predicted probability that the patient (10) will experience a cardiovascular event including at least one of the following: cardiovascular-related hospitalization; newly diagnosed atrial fibrillation; a heart failure episode requiring treatment; a life-threatening ventricular arrhythmia that can lead to cardiac arrest; an appropriate implantable cardioverter defibrillator shock; transient ischemic attack (TIA); stroke; death; acute myocardial infarction; worsening of pVO2; or worsening of the LVOT pressure gradient.
11. The computer-implemented method (1400) according to any one of claims 1 to 10, wherein the operations further comprise: receiving follow-up multidimensional medical data (110) of the patient (10), the follow-up multidimensional medical data (110) being collected for the patient (10) during follow-up visits; and extracting follow-up features from the follow-up multidimensional medical data (110), wherein using the one or more trained ML medical prognostic models (200) to process the extracted features to predict the probability of HCM progression occurring before the threshold date is further based on using the one or more trained ML medical prognostic models (200) to process the extracted follow-up features.
12. The computer-implemented method (1400) according to any one of claims 1 to 11, wherein the operations further comprise: determining that the predicted probability (202) satisfies a threshold probability value; and based on determining that the predicted probability (202) satisfies the threshold probability value, selecting the patient (10) for inclusion in a clinical trial.
13. The computer-implemented method (1400) according to any one of claims 1 to 11, wherein the operations further comprise: determining that the predicted probability (202) satisfies a threshold probability value; and based on determining that the predicted probability (202) satisfies the threshold probability value, treating the patient (10) with a cardiac myosin inhibitor.
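By way of non-limiting illustration only, the threshold determination recited in claims 12 and 13 might be sketched in Python as follows. The sketch assumes that "satisfies" means "is greater than or equal to"; that reading, the function names, the patient identifiers, and the probability values are all hypothetical.

```python
def satisfies_threshold(predicted_probability, threshold_probability):
    """Return True when the predicted probability meets or exceeds the threshold.

    Treating "satisfies" as >= is an assumption made for this sketch.
    """
    for value in (predicted_probability, threshold_probability):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return predicted_probability >= threshold_probability


def select_patients(predicted_probabilities, threshold_probability):
    """Return IDs of patients whose predicted probability satisfies the threshold."""
    return [
        patient_id
        for patient_id, probability in predicted_probabilities.items()
        if satisfies_threshold(probability, threshold_probability)
    ]


# Hypothetical predicted probabilities (202) keyed by patient identifier.
predictions = {"patient-001": 0.72, "patient-002": 0.31, "patient-003": 0.50}
selected = select_patients(predictions, threshold_probability=0.5)
print(selected)
```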
14. A system (100) comprising: data processing hardware (1710); and memory hardware (1720) in communication with the data processing hardware (1710), the memory hardware (1720) storing instructions that, when executed on the data processing hardware (1710), cause the data processing hardware (1710) to perform the computer-implemented method according to any one of claims 1 to 13.
15. A computer-implemented method (1500) which, when executed on data processing hardware (1710), causes the data processing hardware (1710) to perform operations, said operations including: for each specific patient (10) among a plurality of patients (10) diagnosed with hypertrophic cardiomyopathy (HCM) and meeting inclusion criteria: obtaining corresponding baseline training data (810) that includes baseline multidimensional medical data (110) that spans multiple different modalities and is collected for the specific patient (10) within a threshold time period starting from a corresponding index date assigned to the specific patient (10); and obtaining corresponding follow-up training data (810), the follow-up training data (810) including: follow-up multidimensional medical data (110) that spans the multiple different modalities and is collected for the specific patient (10) after the corresponding index date; and a corresponding real clinical outcome (818); and training one or more machine learning (ML) medical prognostic models (200) based on the corresponding baseline training data (810) and the corresponding follow-up training data (810) to teach the one or more ML medical prognostic models (200) to predict the corresponding real clinical outcomes (818).
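By way of non-limiting illustration only, separating a patient's records into baseline and follow-up training data relative to an index date might be sketched in Python as follows. The sketch assumes that follow-up records are those collected after the baseline window closes; that boundary choice, the record layout, the field name `collected_on`, and the 90-day window are hypothetical.

```python
from datetime import date, timedelta


def split_training_data(records, index_date, baseline_window_days):
    """Split records into baseline (inside the window) and follow-up (after it)."""
    window_end = index_date + timedelta(days=baseline_window_days)
    baseline = [r for r in records if index_date <= r["collected_on"] <= window_end]
    # Assumption: anything collected after the baseline window counts as follow-up.
    follow_up = [r for r in records if r["collected_on"] > window_end]
    return baseline, follow_up


# Hypothetical records for one patient; the index date is the diagnosis date.
records = [
    {"collected_on": date(2020, 1, 10), "modality": "imaging"},
    {"collected_on": date(2020, 2, 1), "modality": "ecg"},
    {"collected_on": date(2021, 1, 15), "modality": "imaging"},
]
baseline, follow_up = split_training_data(records, date(2020, 1, 1), 90)
print(len(baseline), "baseline record(s);", len(follow_up), "follow-up record(s)")
```

Records dated before the index date fall into neither list in this sketch.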
16. The computer-implemented method (1500) according to claim 15, wherein: each of the plurality of patients (10) has a first New York Heart Association (NYHA) class assessment on the index date; and for a specific patient (10), the corresponding real clinical outcome (818) includes the specific patient's (10) HCM progressing to a second NYHA class assessment that is higher than the first NYHA class assessment.
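By way of non-limiting illustration only, deriving a binary outcome label from NYHA class assessments might be sketched in Python as follows. The use of a threshold date here follows claims 7 and 8 rather than claim 16 itself, and the function name and data layout are hypothetical.

```python
from datetime import date

# NYHA functional classes in order of increasing severity.
NYHA_RANK = {"I": 1, "II": 2, "III": 3, "IV": 4}


def progression_label(first_class, follow_up_assessments, threshold_date):
    """Return 1 if any assessment before the threshold date is a higher NYHA class."""
    baseline_rank = NYHA_RANK[first_class]
    for assessed_on, nyha_class in follow_up_assessments:
        if assessed_on < threshold_date and NYHA_RANK[nyha_class] > baseline_rank:
            return 1
    return 0


# Hypothetical follow-up assessments as (date, NYHA class) pairs.
assessments = [(date(2021, 6, 1), "I"), (date(2022, 3, 1), "II")]
label = progression_label("I", assessments, threshold_date=date(2023, 1, 1))
print(label)
```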
17. The computer-implemented method (1500) according to claim 15, wherein the corresponding real clinical outcome (818) includes at least one of the following: a transition from non-obstructive HCM to obstructive HCM before a threshold date; or initiation of HCM treatment before the threshold date.
18. The computer-implemented method (1500) according to claim 15, wherein the corresponding real clinical outcome (818) includes a cardiovascular event occurring before a threshold date, said cardiovascular event including at least one of the following: cardiovascular-related hospitalization; newly diagnosed atrial fibrillation; a heart failure episode requiring treatment; a life-threatening ventricular arrhythmia that can lead to cardiac arrest; an appropriate implantable cardioverter defibrillator shock; transient ischemic attack (TIA); stroke; death; acute myocardial infarction; worsening of pVO2; or worsening of the LVOT pressure gradient.
19. The computer-implemented method (1500) according to any one of claims 15 to 18, wherein the multidimensional medical data (110) spanning multiple different modalities includes at least one of the following: medical imaging data; cardiac measurement data; clinical data; electrocardiogram data; laboratory test results; genomic data; or functional test results.
20. The computer-implemented method (1500) according to any one of claims 15 to 19, wherein training the one or more ML medical prognostic models (200) based on the corresponding baseline training data (810) and the corresponding follow-up training data (810) obtained for a specific patient (10) comprises, for each of the multiple different modalities: extracting baseline features associated with the corresponding modality from the corresponding baseline training data (810); extracting follow-up features associated with the corresponding modality from the corresponding follow-up training data (810); and training a corresponding modality-specific ML medical prognostic model (200) based on the baseline features and the follow-up features associated with the corresponding modality.
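By way of non-limiting illustration only, training one model per modality might be sketched in Python as follows. Logistic regression fitted by batch gradient descent stands in for each modality-specific model, and concatenating baseline and follow-up features into one input row is an assumption of this sketch; the modality names and feature values are hypothetical.

```python
import math


def train_logistic(rows, labels, epochs=200, learning_rate=0.1):
    """Fit logistic-regression weights (bias stored last) by batch gradient descent."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            z = weights[-1] + sum(w * v for w, v in zip(weights, x))
            error = 1.0 / (1.0 + math.exp(-z)) - y
            for j, v in enumerate(x):
                grads[j] += error * v
            grads[-1] += error
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grads)]
    return weights


def train_modality_models(baseline, follow_up, outcomes):
    """Train one model per modality on concatenated baseline and follow-up features."""
    models = {}
    for modality in baseline:
        rows = [b + f for b, f in zip(baseline[modality], follow_up[modality])]
        models[modality] = train_logistic(rows, outcomes)
    return models


# Hypothetical per-modality features for four patients (one feature per visit).
baseline = {"imaging": [[1.8], [1.5], [2.2], [1.4]], "ecg": [[1.0], [0.9], [1.2], [0.8]]}
follow_up = {"imaging": [[2.0], [1.5], [2.5], [1.4]], "ecg": [[1.1], [0.9], [1.3], [0.8]]}
outcomes = [1, 0, 1, 0]  # hypothetical real clinical outcomes (818)

models = train_modality_models(baseline, follow_up, outcomes)
for modality, weights in models.items():
    print(modality, [round(w, 3) for w in weights])
```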
21. The computer-implemented method (1500) according to any one of claims 15 to 20, wherein, for at least one specific patient (10) among the plurality of patients (10): obtaining the corresponding baseline training data (810) and the corresponding follow-up training data (810) includes accessing, via federated data access technology, a local storage device storing the corresponding baseline training data (810) and the corresponding follow-up training data (810) of the specific patient (10), the local storage device being controlled by an owner of the corresponding baseline training data (810) and the corresponding follow-up training data (810); and training the one or more ML medical prognostic models (200) based on the corresponding baseline training data (810) and the corresponding follow-up training data (810) includes training the one or more ML medical prognostic models (200) by locally processing the corresponding baseline training data (810) and the corresponding follow-up training data (810) accessed from the local storage device on a corresponding worker node controlled by the owner of the corresponding baseline training data (810) and the corresponding follow-up training data (810).
22. A system (100) comprising: data processing hardware (1710); and memory hardware (1720) in communication with the data processing hardware (1710), the memory hardware (1720) storing instructions that, when executed on the data processing hardware (1710), cause the data processing hardware (1710) to perform the computer-implemented method according to any one of claims 15 to 21.
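By way of non-limiting illustration only, local processing on owner-controlled worker nodes might be sketched in Python as follows. The sketch uses federated averaging of logistic-regression weights, which is one possible technique and is an assumption of this example; the claim does not name a specific aggregation scheme. The node names and data values are hypothetical.

```python
import math


def local_update(weights, rows, labels, learning_rate=0.1, epochs=20):
    """Run gradient descent on one owner's data; only weights leave the node."""
    w = list(weights)
    for _ in range(epochs):
        grads = [0.0] * len(w)
        for x, y in zip(rows, labels):
            z = w[-1] + sum(wi * xi for wi, xi in zip(w, x))
            error = 1.0 / (1.0 + math.exp(-z)) - y
            for j, xi in enumerate(x):
                grads[j] += error * xi
            grads[-1] += error
        w = [wi - learning_rate * g / len(rows) for wi, g in zip(w, grads)]
    return w, len(rows)


def federated_average(updates):
    """Average per-node weights, weighting each node by its number of examples."""
    total = sum(count for _, count in updates)
    size = len(updates[0][0])
    return [sum(w[j] * count for w, count in updates) / total for j in range(size)]


# Hypothetical owner-controlled datasets: (feature rows, outcome labels).
nodes = {
    "owner_a": ([[1.8, 1.0], [1.4, 0.8]], [1, 0]),
    "owner_b": ([[2.2, 1.2], [1.5, 0.9], [2.0, 1.1]], [1, 0, 1]),
}
global_weights = [0.0, 0.0, 0.0]  # two feature weights plus a bias term
for _ in range(10):  # communication rounds
    updates = [local_update(global_weights, rows, labels) for rows, labels in nodes.values()]
    global_weights = federated_average(updates)
print([round(w, 3) for w in global_weights])
```

In this sketch the aggregator sees only weight vectors and example counts, so the underlying patient records remain on the node that owns them.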