COPD complication prediction model training method and COPD complication prediction method and device
By combining an energy diffusion model with knowledge distillation and multimodal data, a COPD complication prediction model is established that addresses the low prediction accuracy of existing techniques and enables accurate identification and risk assessment of COPD complications.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- TONGJI HOSPITAL ATTACHED TO TONGJI MEDICAL COLLEGE HUAZHONG SCI TECH
- Filing Date
- 2026-04-15
- Publication Date
- 2026-06-30
AI Technical Summary
Existing methods for diagnosing COPD complications rely on single-type features and ignore the potential coupling relationships between multimodal data, resulting in low prediction accuracy. Furthermore, COPD patient data exhibit a non-Gaussian distribution, which makes it difficult to capture complex disease changes effectively.
By employing an energy diffusion model and knowledge distillation techniques, and combining multi-task learning with multimodal data, a COPD complication prediction model based on energy diffusion and knowledge distillation is established. The forward and backward diffusion processes are explicitly modeled using energy functions, and the prediction accuracy is improved by combining pre-trained large-scale model knowledge distillation techniques.
It improves the accuracy and flexibility of COPD complication prediction, can identify multiple complication types and severity, enhances the reliability and clinical applicability of the model, and provides more accurate judgment, especially when facing complex or uncertain cases.
Figure CN122025112B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of chronic obstructive pulmonary disease (COPD) complication prediction technology, and in particular to a COPD complication prediction model training method and a COPD complication prediction method and device.
Background Technology
[0002] Chronic obstructive pulmonary disease (COPD) is an irreversible chronic inflammatory airway disease encompassing both chronic bronchitis and emphysema. Its core characteristic is airway obstruction leading to limited airflow, with the condition progressively worsening over time. COPD has become a prevalent respiratory disease worldwide, particularly among the elderly, and its high rates of disability and mortality make it a significant challenge for clinical management. COPD presents with numerous complications and a complex course, which greatly complicates clinical diagnosis and patient management. Common complications such as pulmonary heart disease, respiratory failure, and pneumonia not only severely impact patients' quality of life but also place a heavy burden on the healthcare system.
[0003] Traditional methods for diagnosing COPD complications often rely on single-type features, such as blood oxygen saturation and pulmonary function tests, while neglecting the potentially high degree of coupling between multimodal data (including biochemical indicators, imaging features, and time-series information). Furthermore, the clinical manifestations of COPD patients often exhibit a non-Gaussian distribution, and the data are significantly imbalanced. This makes it difficult for predictive models based on simple assumptions to capture complex disease changes effectively, resulting in the current low prediction accuracy for COPD complications.
[0004] To address this challenge, this invention, in the context of multimodal data-driven COPD complication identification and risk assessment, employs an energy function-guided diffusion model to characterize the forward and backward diffusion distributions. This model simulates the noise injection and denoising processes more flexibly and accurately classifies complication stages and severity by combining multi-task outputs within the latent space. Because the energy function adapts better to non-Gaussian distributions and allows for multi-peak structures, it has strong expressive power in multimodal data fusion, providing higher sensitivity and accuracy in key tasks such as early detection and diagnosis of rare complications. Furthermore, knowledge distillation from a pre-trained large model further enhances the model's predictive accuracy, meeting diverse clinical needs for immediate diagnosis and subsequent in-depth analysis.
Summary of the Invention
[0005] In view of the above problems, this application provides a COPD complication prediction model training method and a COPD complication prediction method and device, which aim to improve the prediction accuracy by establishing a COPD complication prediction model based on energy diffusion and knowledge distillation to address the problem of low prediction accuracy of chronic obstructive pulmonary disease complications.
[0006] In a first aspect, embodiments of this application provide a method for training a COPD complication prediction model, the method comprising:
[0007] Acquiring multimodal data of COPD patients and preprocessing the data to obtain preprocessed clinical and imaging features;
[0008] Inputting the clinical and imaging features into a COPD complication prediction model based on energy diffusion and knowledge distillation to obtain the complication classification results or risk scores corresponding to the clinical and imaging features;
[0009] Determining a cross-entropy loss value based on the complication classification results or risk scores and the true labels of the data;
[0010] Determining a total loss value of the model based on the cross-entropy loss value and a double knowledge distillation loss value;
[0011] Adjusting the model parameters of the COPD complication prediction model based on the total loss value of the model, and using the model after the last parameter adjustment as the target COPD complication prediction model.
[0012] Optionally, the COPD complication prediction model based on energy diffusion and knowledge distillation includes a feature encoding network and an energy diffusion model based on knowledge distillation, wherein:
[0013] The feature encoding network is used to map input features to a latent representation in a latent space;
[0014] The energy diffusion model based on knowledge distillation is used to diffuse the latent representation in the forward process and to predict, in the reverse process, the category of the complication caused by COPD and the risk score of that complication.
[0015] Optionally, the energy diffusion model based on knowledge distillation is constructed through the following steps:
[0016] Define a forward diffusion process controlled by a set of learnable energy functions, gradually increasing the noise as the time step t increases from 0 to T, so that the latent representation z_t at time step t approaches a simple prior;
[0017] Define a reverse generation process, controlled by another set of learnable energy functions or by the same set as the forward diffusion process, that gradually denoises from the latent representation z_T at time step T back to the initial latent representation z0;
[0018] At the end of the reverse generation process, a noise prediction branch and a complication classification branch are connected to predict and output the complication type or risk score.
[0019] Optionally, in the noise prediction branch, the pre-trained MedicalCLIP model is used as the teacher model and the energy diffusion model is used as the student model. Knowledge distillation is performed between the student model and the teacher model to obtain the prediction features that best fit the latent representation z0 at the initial time.
[0020] Optionally, the step of inputting clinical and imaging features into a COPD complication prediction model based on energy diffusion and knowledge distillation to obtain complication classification results or risk scores corresponding to the clinical and imaging features includes:
[0021] Preprocessed clinical and imaging features are mapped to latent representations at the initial time using a feature encoding network;
[0022] The latent representation at the initial time is forward-diffused over T steps;
[0023] In each time step of the reverse generation process, KL divergence alignment and gradient matching terms of the denoised prediction mean are introduced to determine the double knowledge distillation loss. The trained noise prediction branch is obtained by minimizing the double knowledge distillation loss.
[0024] The latent representation z_t at time step t is denoised using the trained noise prediction branch to obtain denoised features, which are then input into the complication classification branch to output the complication classification results or risk scores.
[0025] Optionally, the double knowledge distillation loss is:
[0026]
[0027] where the two weighting coefficients are dynamically adjusted with time step t to balance the contributions of mean alignment and gradient alignment; KL(·) denotes the KL divergence; the student distribution is the denoised prediction probability distribution of the student model at time step t; the teacher distribution is the denoised prediction probability distribution of the teacher model at time step t, based on the high-level features f_large; and the two gradient terms are the denoising prediction gradients of the student model and the teacher model, respectively, with respect to the latent representation z_t at time step t, computed from the denoised prediction mean function of the student model and the denoised prediction mean function of the teacher model.
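One form of the double knowledge distillation loss that is consistent with the definitions above can be sketched as follows. The symbol names (\lambda_1(t), \lambda_2(t), p_S^t, p_T^t, \mu_S, \mu_T), the direction of the KL divergence, the sum over time steps, and the squared-norm penalty on the gradient difference are assumptions introduced for illustration only.

```latex
\mathcal{L}_{\mathrm{DKD}}
  = \sum_{t=1}^{T} \Big[
      \lambda_1(t)\, \mathrm{KL}\big( p_T^{t}(\cdot \mid f_{\mathrm{large}}) \,\big\|\, p_S^{t} \big)
    + \lambda_2(t)\, \big\| \nabla_{z_t} \mu_S(z_t, t) - \nabla_{z_t} \mu_T(z_t, t) \big\|_2^{2}
    \Big]
```

The first term aligns the student's denoised prediction distribution with the teacher's; the second term matches the gradients of the two denoised prediction mean functions with respect to z_t.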
[0028] Optionally, a COPD staging diagnosis branch is further provided at the end of the COPD complication prediction model based on energy diffusion and knowledge distillation, and is used to output the risk level of COPD based on the input latent representation.
[0029] In a second aspect, embodiments of this application provide a method for predicting COPD complications, the method comprising:
[0030] Acquire multimodal data of the target patient;
[0031] The preprocessed multimodal data is input into the target COPD complication prediction model for complication prediction; the target COPD complication prediction model is obtained according to a COPD complication prediction model training method.
[0032] Output the complication category or risk score of the target patient.
[0033] In a third aspect, embodiments of this application provide a COPD complication prediction device, the device comprising:
[0034] The acquisition module is used to acquire multimodal data of the target patient;
[0035] The detection module is used to input preprocessed multimodal data into a target COPD complication prediction model for complication prediction; the target COPD complication prediction model is obtained according to a COPD complication prediction model training method.
[0036] The output module is used to output the complication category or risk score of the target patient.
[0037] Compared with the prior art, the specific beneficial effects of the present invention are as follows:
[0038] First, because clinical data of COPD patients often exhibit non-Gaussian distributions and contain multimodal distributions, existing models typically assume a Gaussian distribution or ignore the complexity of the distribution. This invention utilizes the Energy Diffusion Model (EBDM), which introduces an energy function to explicitly model the forward and backward diffusion processes. This allows for flexible adaptation to the non-Gaussian noise characteristics and complex distribution structure of the data, thus better fitting the complex data distribution of COPD complications.
[0039] Secondly, given the common occurrence of multiple complications in COPD patients, existing technologies may struggle to effectively capture the interactions between these complications, only predicting a single type of complication. This invention, however, utilizes multi-task learning to combine complication classification and risk scoring into a single model, effectively identifying complication types and severity. It can also handle situations where patients have multiple complications and simultaneously predict various potential complications.
[0040] Furthermore, this invention utilizes pre-trained large medical models (such as MedicalCLIP) for knowledge distillation, transferring knowledge from external large models to COPD complication prediction. Through dual knowledge distillation loss (including KL divergence alignment and gradient matching of the denoised prediction mean), the model's learning ability is effectively improved, resulting in higher accuracy in noise prediction and classification tasks.
[0041] Finally, by introducing uncertainty quantification (such as a variance prediction term), this invention provides a confidence component for each diagnostic result, which helps clinicians make more accurate judgments when facing difficult cases, thereby enhancing the reliability and clinical applicability of the model, especially when dealing with complex or uncertain cases.
Attached Figure Description
[0042] To more clearly illustrate the technical solutions of the embodiments of this application, the drawings used in the description of the embodiments of this application will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0043] Figure 1 is a flowchart of the method proposed in this invention.
[0044] Figure 2 is a diagram of the architecture of the proposed COPD complication prediction model.
[0045] Figure 3 is the receiver operating characteristic (ROC) curve in the experimental case.
[0046] Figure 4 is the calibration curve of the average predicted probability of subjects in the experimental cases.
Detailed Implementation
[0047] Exemplary embodiments of this application will now be described in more detail with reference to the accompanying drawings. While exemplary embodiments of this application are shown in the drawings, it should be understood that this application may be implemented in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided to enable a more thorough understanding of this application and to fully convey the scope of this application to those skilled in the art.
[0048] Example 1: As shown in Figures 1-2, this embodiment provides a method for training a COPD complication prediction model. The method includes the following steps:
[0049] Step 1: Acquire multimodal data of COPD patients and perform preprocessing to obtain preprocessed clinical and imaging features;
[0050] Specifically, for N COPD patients, multimodal data are acquired for each patient. The multimodal data include three categories: numerical clinical indicators, fundus imaging features, and sequence data. Numerical clinical indicators include pulmonary function indicators (such as FEV1 and FVC), blood oxygen saturation, blood gas analysis results, inflammatory markers (such as C-reactive protein), and clinical history (such as smoking history and medication use). Sequence data include imaging features (such as chest CT images and MRI images). Fundus imaging features are quantitative indicators of retinal vessels extracted from the patient's fundus photographs, such as vessel diameter and tortuosity. Because COPD causes systemic hypoxia and inflammation that damage microvessels throughout the body, and the fundus vessels are the only microvascular system that can be observed directly and non-invasively, changes in the fundus vessels can serve as biomarkers of the systemic effects of COPD. Acquiring fundus imaging features therefore helps this invention predict complications.
[0051] After acquiring multimodal data from different COPD patients, these data need to be preprocessed. The data preprocessing stage includes the following preprocessing operations:
[0052] (1) Standardize all numerical multimodal data so that each indicator has a mean of 0 and a standard deviation of 1;
[0053] (2) Given that different types of data have different characteristics, this embodiment employs multiple encoders to process each type of data independently, thereby converting the different types of raw data into a unified M-dimensional feature vector. Specifically: for numerical clinical indicators, this invention uses a multilayer perceptron (MLP) for encoding; for fundus image features, deep neural networks containing convolutional layers, such as ResNet, are used for extraction; and for sequence data, an attention-based Transformer network is used for encoding. The preprocessed multimodal data, i.e., the independently encoded features, are used as M-dimensional clinical and imaging features.
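Preprocessing operation (1) above can be illustrated with a minimal sketch. The function name `standardize` and the FEV1 readings are hypothetical, and the use of the population standard deviation is an assumption; the source only states that each indicator should end up with mean 0 and standard deviation 1.

```python
import statistics

def standardize(values):
    """Z-score a list of numbers: subtract the mean, divide by the standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (an assumption)
    if std == 0.0:
        # A constant indicator carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

# Hypothetical FEV1 readings (litres) for five patients.
fev1 = [1.2, 1.8, 2.4, 0.9, 1.5]
fev1_z = standardize(fev1)
```

The encoders in operation (2) (MLP, ResNet, Transformer) would then consume the standardized values; they are not sketched here.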
[0054] Step 2: Input the clinical and imaging features into the COPD complication prediction model based on energy diffusion and knowledge distillation to obtain the complication classification results or risk scores corresponding to the clinical and imaging features;
[0055] Specifically, let X_i represent the M-dimensional clinical and imaging features of the i-th patient, and let y_i ∈ {0,1,…,K} represent the complication type label or stage level, where 0 represents no complications and 1 to K correspond to different complications or different degrees of severity. This invention establishes a reversible mapping between the feature space and the latent space, which maps the encoded features (the M-dimensional clinical and imaging features) of the COPD patient's multimodal data into the latent space. Through energy diffusion and reverse sampling, the complications caused by COPD can then be predicted and scored in the latent space.
[0056] The COPD complication prediction model based on energy diffusion and knowledge distillation includes a feature encoding network and an energy diffusion model based on knowledge distillation. The feature encoding network is used to map the input features to a latent representation (latent variable) in the latent space. The energy diffusion model based on knowledge distillation is used to perform energy diffusion on the latent representation in the forward process and predict the type of complication caused by COPD and the risk score of the complication in the reverse process. The risk score refers to the probability that a patient may have a certain complication (such as pulmonary heart disease).
[0057] The COPD complication prediction model based on energy diffusion and knowledge distillation can be constructed through the following steps:
[0058] First, a feature encoding network is defined as a mapping from an M-dimensional initial feature space (i.e., the M-dimensional clinical and imaging features generated and concatenated by the multiple encoders in the preprocessing stage) to a d-dimensional latent representation space obtained after deep fusion and mapping by the feature encoding network.
[0059] The feature encoding network has trainable parameters, and its output is the latent representation of the clinical and imaging features x of a patient. The feature encoding network can be a multi-branch Transformer network that includes an attention structure. To facilitate overall optimization, this invention updates the parameters of the feature encoding network and of the diffusion network described below (i.e., the energy diffusion model) simultaneously during joint training, so that the optimal latent representation is learned end-to-end.
[0060] Secondly, an energy diffusion model based on knowledge distillation (EBDM) is constructed. This specifically includes:
[0061] 1) Define a forward diffusion process that is controlled by a set of learnable energy functions and gradually increases the noise as the time step t increases from 0 to T, so that the latent representation z_t at time step t tends to a simple prior (such as a Gaussian distribution);
[0062] 2) Define a reverse generation process that is controlled by another set of learnable energy functions, or by the same set as the forward diffusion process, and denoises step by step from the latent representation z_T at time step T back to the patient's initial latent representation z0;
[0063] 3) Connect a noise prediction branch and a complication classification (output) head at the end of the reverse generation process to make deterministic predictions of complication type, risk score, and stage risk.
[0064] Unlike traditional DDPM (Denoising Diffusion Probabilistic Models) which directly define Gaussian diffusion, this invention explicitly models both the forward and backward diffusion distributions using energy functions. Therefore, it can more flexibly adapt to the complex distribution of COPD complications and non-Gaussian noise characteristics.
[0065] The initial latent representation z0 corresponds to the true latent distribution q(z0). At step t, the latent representation is denoted as z_t. This invention uses one energy function for forward diffusion and another energy function for the reverse process. The forward diffusion process can be defined as:
[0066]
[0067] where z_t is the latent representation at time step t, z_{t-1} is the latent representation at time step t-1, exp denotes the exponential function, and ∝ denotes proportionality. Similarly, the reverse generation process is defined as:
[0068]
[0069] For the latent representation z0 at the initial time and the latent representation z_T at the final time step T, the boundary distributions must also be defined. The distribution of z_T is obtained by composing the T forward diffusion steps. Since z_T should approximate a prior distribution, the energy functions are trained so that, after T steps, the data distribution approximates a standard normal distribution, where μ denotes the standard normal distribution.
[0070] To embed complication diagnosis into energy diffusion, the reverse network outputs a noise prediction branch and a complication classification branch. Specifically, its output is split into two parts: one part serves as the noise prediction branch, which controls the denoising energy for z_{t-1} (corresponding to the form of the sampling distribution); the other part serves as the complication classification branch, which outputs the classification results or risk scores of the complications.
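As a worked illustration of the forward and reverse definitions, one common way to write energy-defined transitions is shown below. The symbols E_\phi and E_\theta for the forward and reverse energy functions, their argument order, and the identity-covariance prior are assumptions for illustration; only the proportionality to an exponential and the standard normal limit are taken from the description.

```latex
q_{\phi}(z_t \mid z_{t-1}) \;\propto\; \exp\!\big(-E_{\phi}(z_t, z_{t-1}, t)\big),
\qquad
p_{\theta}(z_{t-1} \mid z_t) \;\propto\; \exp\!\big(-E_{\theta}(z_{t-1}, z_t, t)\big)

q(z_T \mid z_0) \;=\; \int \prod_{t=1}^{T} q_{\phi}(z_t \mid z_{t-1}) \, dz_{1:T-1}
\;\approx\; \mathcal{N}(0, I)
```

With a quadratic energy in z_t, the forward transition reduces to the Gaussian step of a standard diffusion model, so the Gaussian case is a special case of this form.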
[0071] Furthermore, to enhance the learning ability of the noise prediction branch on COPD-related feature representations, knowledge distillation alignment is performed within the noise prediction branch. Prior knowledge from the expert model is introduced through knowledge distillation as guidance, ensuring that the denoising process intelligently retains the most critical medical features for downstream complication prediction tasks. This makes the denoising process more accurate, obtaining denoised features that best fit the initial latent representation z0.
[0072] Specifically, in this embodiment, the MedicalCLIP model is trained on publicly available COPD datasets (including the COPDGene database provided by the National Institutes of Health (NIH) and multimodal data from the UK Biobank dataset) to obtain a pre-trained MedicalCLIP model. This pre-trained MedicalCLIP model is used as the teacher model, and the energy diffusion model is used as the student model to perform knowledge distillation with the teacher model.
[0073] Optionally, step 2 may include the following sub-steps:
[0074] Step 2.1: Map the preprocessed clinical and imaging features of the patient to the patient's initial latent representation using a feature encoding network;
[0075] Step 2.2: Perform forward energy diffusion on the initial latent representation through T steps;
[0076] Specifically, the initial latent representation z0 obtained from the patient's data is diffused by sampling the latent representation z_t at each time step from t=1 to t=T;
[0077] Step 2.3: In the reverse generation process, the latent representation z_t at time step t is denoised using the trained noise prediction branch to obtain denoised features, and these denoised features are input into the complication classification branch to output the complication classification result or risk score.
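The forward diffusion of step 2.2 can be sketched for a single scalar latent. This is a minimal sketch under a strong simplifying assumption: the learnable energy function is replaced by a fixed quadratic energy, for which the energy-defined transition is exactly Gaussian. The names `forward_step` and `forward_diffuse`, the constant schedule `betas`, and the starting value 5.0 are hypothetical.

```python
import math
import random

def forward_step(z_prev, beta, rng):
    """Sample z_t from q(z_t | z_prev) proportional to exp(-E), using the assumed
    quadratic energy E = (z_t - sqrt(1 - beta) * z_prev)**2 / (2 * beta).
    For this energy the transition is a Gaussian with variance beta."""
    mean = math.sqrt(1.0 - beta) * z_prev
    return mean + math.sqrt(beta) * rng.gauss(0.0, 1.0)

def forward_diffuse(z0, betas, rng):
    """Apply one forward step per entry of `betas` and return z_T."""
    z = z0
    for beta in betas:
        z = forward_step(z, beta, rng)
    return z

rng = random.Random(0)
betas = [0.05] * 200  # hypothetical constant noise schedule, T = 200
samples = [forward_diffuse(5.0, betas, rng) for _ in range(2000)]
```

After T steps the samples are close to a standard normal distribution regardless of the starting value, which is the behaviour the description requires of z_T. A learned, non-quadratic energy would need a general sampler instead of the closed-form Gaussian draw.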
[0078] Specifically, based on the pre-trained MedicalCLIP model, this invention designs a double knowledge distillation loss by introducing KL divergence alignment and gradient matching terms of the denoised prediction mean at each time step of the reverse generation process. By minimizing the double knowledge distillation loss, the noise prediction branch is trained to improve its learning ability on COPD-related feature representations. The double knowledge distillation loss is:
[0079]
[0080] where the two weighting coefficients are dynamically adjusted with time step t to balance the contributions of mean alignment and gradient alignment; KL(·) denotes the KL divergence; the student distribution is the denoised prediction probability distribution of the student model EBDM at time step t; the teacher distribution is the denoised prediction probability distribution of the MedicalCLIP teacher model at time step t, based on the high-level features f_large, which are a high-level feature representation rich in medical semantics obtained by encoding the raw, clean patient multimodal data with the teacher model; and the two gradient terms are the denoising prediction gradients of the student model and the teacher model, respectively, with respect to the latent representation z_t at time step t.
[0081] Minimizing the double knowledge distillation loss allows EBDM not only to gradually approximate the distribution of MedicalCLIP in denoising mean prediction, but also to follow the gradient trend of its denoising prediction, ensuring that the model accurately reflects the dynamic changes of the real data distribution at each denoising step.
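The two parts of the double knowledge distillation loss can be sketched numerically as follows. This is a minimal sketch, not the patented formula: the KL direction (teacher to student), the squared L2 penalty on the gradient difference, and the names `kl_divergence`, `squared_l2`, and `double_kd_loss` are assumptions, and the probability vectors and gradients are taken as given inputs rather than produced by real models.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete probability vectors of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0.0)

def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def double_kd_loss(student_probs, teacher_probs, student_grads, teacher_grads,
                   w_kl, w_grad):
    """Sum over time steps of
    w_kl[t] * KL(teacher_t || student_t) + w_grad[t] * ||grad_S_t - grad_T_t||^2."""
    total = 0.0
    for t in range(len(student_probs)):
        total += w_kl[t] * kl_divergence(teacher_probs[t], student_probs[t])
        total += w_grad[t] * squared_l2(student_grads[t], teacher_grads[t])
    return total

# One time step with hypothetical values and time-dependent weights.
loss = double_kd_loss(
    student_probs=[[0.9, 0.1]], teacher_probs=[[0.5, 0.5]],
    student_grads=[[1.0, 0.0]], teacher_grads=[[0.0, 0.0]],
    w_kl=[1.0], w_grad=[0.5],
)
```

Passing the weights as per-step lists mirrors the description's statement that the coefficients are adjusted dynamically with the time step t.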
[0082] The complication classification branch outputs a classification result (i.e., the type of complication, such as pulmonary heart disease, respiratory failure, etc.) and a risk score. The risk score predicts the probability that a patient will develop a certain complication; for example, patient A has a 70% risk of developing pulmonary heart disease. The complication classification branch uses a softmax activation function at its terminal to simultaneously obtain both classification results and the risk score. The direct output of this activation function is the risk score (i.e., probability value) for each complication. The classification result selects the category with the highest probability among all risk scores as the final predicted label. Therefore, the risk score is fundamental, and the classification result is the final judgment made based on that score.
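The relationship between risk scores and the classification result can be sketched as below. The function names, the label list, and the logit values are hypothetical; only the use of softmax followed by selection of the highest-probability category comes from the description.

```python
import math

def softmax(logits):
    """Numerically stable softmax: returns probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def classify(logits, labels):
    """Return (risk_scores, predicted_label): one probability per label,
    plus the label with the highest probability."""
    scores = softmax(logits)
    best = max(range(len(scores)), key=scores.__getitem__)
    return dict(zip(labels, scores)), labels[best]

labels = ["none", "pulmonary heart disease", "respiratory failure", "pneumonia"]
scores, predicted = classify([0.2, 2.1, 0.5, -0.3], labels)
```

Here `scores` holds the risk score for every category, and `predicted` is the final judgment derived from those scores.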
[0083] The output of the complication classification branch at time step t is denoted c_t and serves as the risk score for z_t; the category prediction is the final decision made based on this risk score. During training, after each z_t is sampled, the branch outputs the prediction result p_t, and the classification loss is calculated from p_t and the true label y. In a multi-task design, a separate risk score is defined for each complication m.
[0084] Step 3: Determine the cross-entropy loss value based on the risk score and the true labels of the data;
[0085] In a multi-task context, the cross-entropy loss function is:
[0086]
[0087] where the label term is the true label of complication m; the weight term is the corresponding weight or hyperparameter; the expectation is taken over the posterior distribution of the latent representation z_t given the input z0; the score term is the risk score for complication m; the parameter term is the set of trainable parameters; and M is the total number of complication categories.
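A weighted multi-task cross-entropy of the kind described can be sketched as follows. This is a sketch under assumptions: the per-sample form -sum_m w_m * y_m * log(p_m) and the Monte Carlo average used to approximate the expectation over sampled latents are illustrative choices, and the function names are hypothetical.

```python
import math

def weighted_cross_entropy(risk_scores, true_labels, weights, eps=1e-12):
    """-sum_m w_m * y_m * log(p_m) for one sampled latent."""
    return -sum(w * y * math.log(p + eps)
                for p, y, w in zip(risk_scores, true_labels, weights))

def expected_cross_entropy(sampled_scores, true_labels, weights):
    """Monte Carlo estimate of the expectation over sampled latents z_t:
    average the per-sample loss over all sampled score vectors."""
    losses = [weighted_cross_entropy(s, true_labels, weights)
              for s in sampled_scores]
    return sum(losses) / len(losses)

# Risk scores from two hypothetical sampled latents; complication 0 is the true one.
ce = expected_cross_entropy(
    sampled_scores=[[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]],
    true_labels=[1, 0, 0],
    weights=[1.0, 1.0, 1.0],
)
```

Raising a weight for a rare complication increases its contribution to the loss, which is one way to counter the class imbalance noted in the background section.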
[0088] Step 4: Determine the total loss of the model based on the cross-entropy loss and the double knowledge distillation loss;
[0089] This invention minimizes both the diffusion KL term and the cross-entropy classification loss term during a unified training process, letting:
[0090]
[0091] where the balancing term determines the relative weights of the noise distribution learning and classification tasks.
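A total loss consistent with steps 3 and 4 can be written as a weighted sum. The symbol \gamma for the balancing term, and its placement on the distillation term rather than on the classification term, are assumptions for illustration.

```latex
\mathcal{L}_{\mathrm{total}} \;=\; \mathcal{L}_{\mathrm{CE}} \;+\; \gamma \, \mathcal{L}_{\mathrm{DKD}},
\qquad \gamma > 0
```

A larger \gamma puts more emphasis on noise distribution learning through distillation; a smaller \gamma favours the classification task.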
[0092] Step 5: Based on the total loss value of the model, adjust the model parameters of the COPD complication prediction model, and use the model after the last parameter adjustment as the target COPD complication prediction model;
[0093] In each minibatch, this invention first uses the forward energy function to diffuse z0 to z_t, then uses the reverse energy function to perform denoising sampling on z_t, and outputs the prediction results through the complication classification branch. The loss is then accumulated and backpropagated to update the parameters. This cycle is repeated, so that the energy diffusion process and the complication classification work together within the same latent space.
[0094] Because z_{t-1} is sampled within the same forward-backward pass, its gradient must be handled explicitly, using reparameterization techniques or stochastic gradient estimation methods (such as SGLD or REINFORCE-style estimators) so that the gradient can be backpropagated to the model parameters. After training converges, the model acquires full sampling and classification capabilities in the latent space. MedicalCLIP is frozen during the training of the diffusion model.
[0095] Example 2: This application also provides a method for predicting COPD complications, which may include the following steps:
[0096] Step 1: Acquire multimodal data of the target patient;
[0097] Step 2: Input the preprocessed multimodal data into the target COPD complication prediction model; the target COPD complication prediction model is obtained according to a COPD complication prediction model training method;
[0098] Step 3: Determine the current inference mode. If the inference mode is the diagnosis mode, output the complication category and risk score of the target patient. Otherwise, if the inference mode is the generation and comparison mode, determine the most likely type of complication for the patient.
[0099] When using a trained target COPD complication prediction model for inference, two modes are employed: a diagnostic mode and a generation / comparison mode. In the diagnostic mode, given multimodal data x of a new patient, the model first... Mapped to z0, several steps of denoising and updating are performed in the inverse model (or a single inference is performed directly). This allows for the output of complication categories or risk scores. This mode is relatively fast and suitable for real-time screening.
[0100] In the generation comparison mode, from the prior Or increase the noise from z0 to z t After backsampling several times, a sample set is obtained. , Specifically, this refers to a series of data samples generated under a fixed assumption (e.g., "cor pulmonale"), representing the "feature distribution" of various typical manifestations of this complication, rather than a single feature point. This is achieved by comparing the latent representation z0 of the patient's actual data with the set generated by the model under the specific complication assumption. The reconstruction distance of each data point is used to select the sample closest to z0 for further diagnosis. The latter mode takes longer, but the comparison results can take into account the diversity of disease manifestations, making them more stable and reliable, and more valuable in early risk analysis and disease simulation. Because this invention is strictly based on an energy function, it allows non-Gaussian noise and supports multi-peak distributions, thus enabling the generation of comparative modes to present multiple possible progressions of COPD complications.
[0101] The comparison results produced in the generation and comparison mode give doctors highly interpretable diagnostic clues that guide the next steps in diagnosis and treatment. For example, if the calculated distance between a patient's data and the "pulmonary heart disease" (cor pulmonale) sample set is much smaller than the distance to the sample sets of other complications, this strongly suggests that the patient's condition pattern closely resembles pulmonary heart disease, which identifies it as the most likely type of complication. On this basis, doctors can plan targeted follow-up examinations (such as echocardiography and electrocardiogram) instead of conducting a broad screening. This mode not only provides strong evidence for early risk analysis but also helps doctors focus their diagnostic reasoning when symptoms are atypical, thereby improving diagnostic efficiency and accuracy.
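As an illustration of the comparison step, the following minimal Python sketch selects the complication hypothesis whose generated sample lies closest to the patient's latent representation z0. It assumes Euclidean distance as a stand-in for the reconstruction distance; the names `closest_hypothesis` and `sample_sets` and the two-dimensional toy vectors are hypothetical and are not taken from this application.

```python
import math

def euclidean(a, b):
    """Euclidean distance, used here as an assumed reconstruction distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_hypothesis(z0, sample_sets):
    """Return (hypothesis, distance) of the generated sample nearest to z0."""
    best = None
    for hypothesis, samples in sample_sets.items():
        for sample in samples:
            d = euclidean(z0, sample)
            if best is None or d < best[1]:
                best = (hypothesis, d)
    return best

# Toy sample sets, one per complication hypothesis (illustrative values only).
sample_sets = {
    "cor pulmonale": [[0.9, 0.1], [1.1, 0.0]],
    "respiratory failure": [[-1.0, 0.8], [-0.8, 1.2]],
}
z0 = [1.0, 0.1]  # latent representation of the patient's actual data
label, dist = closest_hypothesis(z0, sample_sets)
```

With this toy input the nearest sample belongs to the "cor pulmonale" set, so that hypothesis is returned. In practice the sample sets would come from reverse sampling under each complication hypothesis.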
[0102] To prevent the randomness of the diffusion model from affecting the accuracy of the results, uncertainty quantification is introduced, either by adding a variance prediction term to the reverse energy function or by directly estimating the energy shape of the distribution of z_{t-1}. With the variance prediction of reverse sampling defined, the energy function is given an additional term:
[0103]
[0104] where one term penalizes extreme values of the variance to avoid numerical instability, one coefficient serves as its weight, and the remaining term is a learnable energy correction term;
[0105] During training, log-likelihood or score-matching terms are embedded between the samples of z_t and z_{t-1} so that the variance prediction accurately characterizes the noise level of the system. At inference, Monte Carlo sampling is performed: the stochastic reverse process with variance prediction is run N times independently, the confidence component is determined from the consistency of the N classification probability results, and it is output through the complication classification branch. For the classification result p_t, temperature scaling or a Dirichlet distribution model is applied so that p_t carries an uncertainty measure during inference. With this dual design of energy and variance, the system's output on the risk of COPD complications includes not only the most probable classification but also an uncertainty estimate, which facilitates in-depth examination of complex cases.
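The Monte Carlo step can be sketched as follows. This minimal illustration assumes that the confidence component is the fraction of the N runs that agree with the majority class; the application does not fix the consistency measure, and `toy_reverse_process` is a hypothetical stand-in for the stochastic reverse process with variance prediction.

```python
import math
import random
from collections import Counter

def toy_reverse_process(rng):
    """Hypothetical stand-in for one stochastic reverse pass: noisy logits -> softmax."""
    logits = [2.0 + rng.gauss(0.0, 0.3), 0.5 + rng.gauss(0.0, 0.3), rng.gauss(0.0, 0.3)]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mc_confidence(predict_once, n_runs, seed=0):
    """Run the stochastic predictor n_runs times and summarize the agreement."""
    rng = random.Random(seed)
    runs = [predict_once(rng) for _ in range(n_runs)]
    n_classes = len(runs[0])
    # Average class probabilities over all runs.
    mean_probs = [sum(r[k] for r in runs) / n_runs for k in range(n_classes)]
    # Count which class each run ranks first, then take the majority class.
    votes = Counter(r.index(max(r)) for r in runs)
    top_class, top_votes = votes.most_common(1)[0]
    return top_class, mean_probs, top_votes / n_runs

top_class, mean_probs, confidence = mc_confidence(toy_reverse_process, n_runs=50)
```

A low agreement fraction would flag a case for closer review, which matches the stated purpose of the uncertainty estimate. Other consistency measures, such as the variance or entropy of the averaged probabilities, could be substituted.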
[0106] In multi-task learning: common complications of COPD include pulmonary heart disease, respiratory failure, and pneumonia, and patients often present with several concurrent or overlapping conditions. This invention handles the multi-task scenario by defining a complication classification branch that contains several sub-branches, each of which outputs the classification probability or risk score of the corresponding complication m.
[0107] Training is performed task by task, and the per-task losses are weighted and accumulated into the overall loss. Since all branches share the same energy diffusion process, the model can capture the interactive features of multiple complications in the latent dimension. For example, when pulmonary heart disease and respiratory failure often occur together, the model's energy function maps the corresponding latent representation z_t to a joint high-energy or low-energy region, which accelerates the identification of overlapping features of multiple diseases during reverse sampling.
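The weighted accumulation of per-task losses can be sketched as follows. This minimal example assumes a binary cross-entropy loss for each complication and fixed, hand-picked task weights; the function names and all numeric values are illustrative only.

```python
import math

def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy for one binary label y and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def multitask_loss(probs, labels, weights):
    """Weighted sum of the per-complication losses."""
    return sum(
        weights[m] * binary_cross_entropy(probs[m], labels[m]) for m in probs
    )

# Toy outputs of three sub-branches for one patient (illustrative values only).
probs = {"pulmonary heart disease": 0.8, "respiratory failure": 0.6, "pneumonia": 0.1}
labels = {"pulmonary heart disease": 1, "respiratory failure": 1, "pneumonia": 0}
weights = {"pulmonary heart disease": 1.0, "respiratory failure": 1.0, "pneumonia": 0.5}

total = multitask_loss(probs, labels, weights)
```

Because the labels are independent per complication, a patient with several concurrent complications contributes a positive label to more than one sub-branch at once.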
[0108] If staging diagnosis is required, for example when COPD is divided into mild, moderate, and severe stages, the corresponding risk level can be output by adding an output head to this prediction model. For example, a COPD staging diagnosis branch can be added to the complication classification branch. Like the complication classification branch, this branch receives the latent representation z_t as input. A hierarchical noise schedule is built into the energy diffusion process so that early mild states and mid-to-late severe states form separable clusters in the latent space, which avoids excessive overlap of features from different stages. This hierarchical noise schedule can be implemented as a non-linear, piecewise noise schedule: larger noise is added at the boundaries between coarse-grained categories (e.g., mild and severe), while a gradual noise increment is set within fine-grained categories. This forces the model to make macroscopic, coarse classification decisions first during reverse denoising, followed by local, refined classification decisions.
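One possible reading of the piecewise schedule is sketched below. It is an illustration under stated assumptions: the noise level grows by a small increment at ordinary steps and by a larger jump at the steps designated as coarse-category boundaries. The function name, the boundary steps, and the increment values are hypothetical and are not specified by this application.

```python
def piecewise_noise_schedule(num_steps, boundary_steps, step_increment, boundary_jump):
    """Return a non-decreasing list of noise levels, one per diffusion step.

    Steps listed in boundary_steps receive the larger boundary_jump;
    all other steps receive the gradual step_increment.
    """
    levels = []
    level = 0.0
    for t in range(1, num_steps + 1):
        level += boundary_jump if t in boundary_steps else step_increment
        levels.append(level)
    return levels

# Ten steps with assumed coarse-category boundaries at steps 4 and 8.
schedule = piecewise_noise_schedule(
    num_steps=10, boundary_steps={4, 8}, step_increment=0.01, boundary_jump=0.2
)
```

During reverse denoising, the large jumps are undone at well-separated steps, which is one way to stage coarse decisions before fine ones.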
[0109] Regarding the selection of hyperparameters, this invention requires setting the number of diffusion steps T, the forward and reverse network structures, the specific form of the energy function, the variance prediction and uncertainty level, the task weights, and the overall weight. Regarding the selection of T: an excessively large value increases training time and the risk of numerical instability, but gives the model more steps to gradually add and remove noise, which improves the accuracy of distribution fitting. Therefore, T should be selected from discrete values within a reasonable range (e.g., 50 to 200), using grid search or automatic tuning based on validation-set metrics. For the gain function, piecewise linear or polynomial forms are used to ensure a monotonically increasing or decreasing trend, thus meeting the convergence requirements of forward noising and reverse denoising.
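A grid search over discrete values of T can be sketched as follows. The scoring function here is a placeholder so that the example runs; in practice it would train or evaluate the model with T diffusion steps and return a validation-set metric. The names `grid_search_steps` and `placeholder_score` and the candidate spacing of 50 are assumptions.

```python
def grid_search_steps(candidates, validation_score):
    """Return (best_T, best_score) over the candidate step counts."""
    best_T, best_score = None, float("-inf")
    for T in candidates:
        score = validation_score(T)
        if score > best_score:
            best_T, best_score = T, score
    return best_T, best_score

def placeholder_score(T):
    """Placeholder metric that peaks at T = 100; not a real validation result."""
    return -abs(T - 100) / 100.0

# Candidate values within the range of 50 to 200 mentioned in the text.
best_T, best_score = grid_search_steps(range(50, 201, 50), placeholder_score)
```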
[0110] Experimental Case: The complication prediction method proposed in this invention was used to predict complications in patients at a hospital. First, the target COPD complication prediction model was deployed in the hospital information system or research platform. Second, the specific execution process of using this COPD complication prediction method included:
[0111] (1) Collect multimodal data of COPD patients in a hospital, including pulmonary function tests, blood gas analysis, imaging data (such as chest CT scans), blood tests, and medical history (such as smoking history, medication use, etc.). Missing values are imputed to ensure dimensional completeness;
[0112] (2) Obtain the latent representation z0 of x through the feature encoding network;
[0113] (3) In diagnostic mode, input z0 into the reverse branch and immediately compute the multi-task output p0 to obtain the complication type or risk score; or perform several steps of noise sampling and denoising iteration, z0 → z_t → … → z'0, and aggregate the output p_t of each step to obtain the final classification result;
[0114] (4) Based on the classification results or the risk curve drawn based on the risk score, the clinician reviews the results and decides on the treatment and intervention plan in conjunction with other examination results;
[0115] (5) For difficult cases and patients with suspected high-risk complications, the mode can be switched to the generation and comparison mode: starting from the prior z_T, multiple samplings generate latent samples, which are then matched and compared with the real z0 to identify the best-fitting complication pathways, thus helping to determine the type or severity of potential lesions.
[0116] Performance of complication classification was assessed with receiver operating characteristic (ROC) curves. As can be seen from Figure 3, the ROC curve lies far from the diagonal and close to the upper-left boundary of the axes, indicating that the model achieves a high true positive rate at a low false positive rate. The AUC values of this model reached 0.93 and 0.91 on the two major complication prediction tasks, pulmonary heart disease and type II respiratory failure, respectively, which confirms that the present invention has a good ability to identify complication types.
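For reference, an AUC can be computed from labels and scores as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The sketch below uses this pairwise definition in plain Python; the four-sample input is toy data, not data from this study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form is quadratic in the number of samples; for large test sets a rank-based computation gives the same value more efficiently.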
[0117] In terms of early detection, complications were predicted for patients in a pre-labeled test set of early-stage complications, yielding the calibration curve shown in Figure 4. The horizontal axis of the calibration curve represents the average predicted probability of each equal-frequency bin, and the vertical axis represents the positive rate of the corresponding bin. As can be seen from Figure 4, the scatter points lie close to the diagonal, indicating that the model does not systematically overestimate or underestimate risk across the full confidence range. After isotonic regression and mild post-calibration (α=0.88) of the risk output of this invention, the predicted probability and the observed frequency are basically consistent within each bin (Brier≈0.12, ECE≈0.04), indicating that the classification probabilities are well calibrated and the results are reliable.
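The Brier score and the expected calibration error (ECE) can be computed as in the following sketch. It assumes binary labels and equal-frequency bins, matching the binning described for the calibration curve; the toy inputs are illustrative and are not the data behind the reported values.

```python
def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)

def expected_calibration_error(labels, probs, n_bins):
    """ECE over equal-frequency bins: weighted mean of |mean prediction - positive rate|."""
    pairs = sorted(zip(probs, labels))
    n = len(pairs)
    ece = 0.0
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        mean_p = sum(p for p, _ in chunk) / len(chunk)
        pos_rate = sum(y for _, y in chunk) / len(chunk)
        ece += len(chunk) / n * abs(mean_p - pos_rate)
    return ece

labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
brier = brier_score(labels, probs)
ece = expected_calibration_error(labels, probs, n_bins=2)
```

Lower values are better for both metrics; a perfectly calibrated model has an ECE of zero.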
[0118] For extremely rare complications (such as COPD combined with pulmonary hypertension), rare samples were added to the pre-trained model, which allowed this model to capture their distribution characteristics in the energy space with fine precision. In terms of implementation details, the entire system was deployed on a cloud GPU cluster, with the batch size set to 64 to achieve high-throughput training. To meet the needs of real-time clinical applications, the number of reverse steps T was pruned to 20 during deployment, reducing the single-inference time to less than 500 milliseconds and enabling rapid output of complication types through the classification head.
[0119] Example 3: This example provides a COPD complication prediction device, the device comprising:
[0120] The acquisition module is used to acquire multimodal data of the target patient;
[0121] The detection module is used to input the preprocessed multimodal data into the target COPD complication prediction model for complication prediction; the target COPD complication prediction model is obtained according to a COPD complication prediction model training method of the above embodiment;
[0122] The output module is used to output the complication category or risk score of the target patient.
[0123] Although preferred embodiments of the present application have been described, those skilled in the art, upon learning the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments as well as all changes and modifications falling within the scope of the embodiments of the present application.
[0124] Finally, it should be noted that in this text, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or terminal device that includes said element.
[0125] The above provides a detailed description of the COPD complication prediction model training method, COPD complication prediction method, and apparatus provided in this application. Specific examples have been used to illustrate the principles and implementation methods of this application. The descriptions of the above embodiments are only for the purpose of helping to understand the method and its core ideas. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of this application. Therefore, the content of this specification should not be construed as a limitation of this application.
Claims
1. A method for training a COPD complication prediction model, characterized in that the method comprises: acquiring multimodal data of COPD patients and preprocessing the data to obtain preprocessed clinical and imaging features; inputting the clinical and imaging features into a COPD complication prediction model based on energy diffusion and knowledge distillation to obtain complication classification results or risk scores corresponding to the clinical and imaging features; determining a cross-entropy loss value based on the complication classification results or risk scores and the true labels of the data; determining the total loss value of the model based on the cross-entropy loss value and a double knowledge distillation loss value; and adjusting the model parameters of the COPD complication prediction model based on the total loss value of the model, and using the model after the last parameter adjustment as the target COPD complication prediction model.
2. The method according to claim 1, characterized in that the COPD complication prediction model based on energy diffusion and knowledge distillation comprises a feature encoding network and an energy diffusion model based on knowledge distillation, wherein: the feature encoding network is used to map input features to a latent representation in a latent space; and the energy diffusion model based on knowledge distillation is used to diffuse the latent representation in the forward process and to predict the category of complications caused by COPD and the risk score of the complication in the reverse process.
3. The method according to claim 2, characterized in that the energy diffusion model based on knowledge distillation is constructed through the following steps: defining a forward diffusion process, controlled by a set of learnable energy functions, that gradually increases the noise as the time step t goes from 0 to T, so that the latent representation z_t at time step t approaches a simple prior; defining a reverse generation process, controlled by another set of learnable energy functions or by the same set as in the forward diffusion process, that gradually denoises from the latent representation z_T at time step T back to the initial latent representation z0; and connecting a noise prediction branch and a complication classification branch at the end of the reverse generation process to predict and output the complication type or risk score.
4. The method according to claim 3, characterized in that, in the noise prediction branch, the pre-trained MedicalCLIP model is used as the teacher model and the energy diffusion model is used as the student model, and knowledge distillation is performed between the student model and the teacher model to obtain the prediction feature that best fits the latent representation z0 at the initial time.
5. The method according to claim 4, characterized in that inputting the clinical and imaging features into the COPD complication prediction model based on energy diffusion and knowledge distillation to obtain the complication classification results or risk scores corresponding to the clinical and imaging features comprises: mapping the preprocessed clinical and imaging features to the latent representation at the initial time using the feature encoding network; forward-diffusing the latent representation at the initial time through T steps; in each time step of the reverse generation process, introducing KL divergence alignment and gradient matching terms of the denoised prediction mean to determine the double knowledge distillation loss, and obtaining the trained noise prediction branch by minimizing the double knowledge distillation loss; and denoising the latent representation z_t at time step t based on the noise output by the trained noise prediction branch to obtain denoised features, which are then input into the complication classification branch to output the complication classification result or risk score.
6. The method according to claim 5, characterized in that the double knowledge distillation loss is a weighted combination of a divergence term and a gradient matching term, wherein: the two weighting coefficients are dynamically adjusted with the time step t and balance the contributions of mean alignment and gradient alignment; the divergence is computed between the denoised prediction probability distribution of the student model at time step t and the denoised prediction probability distribution of the teacher model at time step t, the latter being based on the high-level features f_large; and the gradient matching term compares the denoising prediction gradients of the student model and the teacher model with respect to the latent representation z_t at time step t, these gradients being taken of the denoised prediction mean function of the student model and the denoised prediction mean function of the teacher model, respectively.
7. The method according to claim 2, characterized in that the COPD complication prediction model based on energy diffusion and knowledge distillation further comprises a COPD staging diagnosis branch at the end, which outputs the risk level of COPD based on the input latent representation.
8. A method for predicting COPD complications, characterized in that the method comprises: acquiring multimodal data of the target patient; inputting the preprocessed multimodal data into the target COPD complication prediction model for complication prediction, wherein the target COPD complication prediction model is obtained by the COPD complication prediction model training method according to any one of claims 1-7; and determining the current inference mode, wherein if the inference mode is the diagnosis mode, the complication category and risk score of the target patient are output, and otherwise, if the inference mode is the generation and comparison mode, the type of complication of the patient is determined.
9. The method for predicting COPD complications according to claim 8, characterized in that, if the inference mode is the generation and comparison mode, determining the patient's most likely type of complication comprises: starting from the prior, or adding noise to take the initial latent representation z0 to z_t, and then performing reverse sampling several times to obtain a sample set under the complication hypothesis of category S; and comparing the latent representation of the patient's true data with each data point in the sample sets under different complication hypotheses by reconstruction distance, and selecting the sample closest to the latent representation of the patient's true data to determine the patient's complication type.
10. A COPD complication prediction device, characterized in that the device comprises: an acquisition module, used to acquire multimodal data of the target patient; a detection module, used to input the preprocessed multimodal data into the target COPD complication prediction model for complication prediction, wherein the target COPD complication prediction model is obtained by the COPD complication prediction model training method according to any one of claims 1-7; and an output module, used to output the complication category or risk score of the target patient.