A prediction model and verification method for HIV / AIDS patient ART virological failure
A virological failure prediction model for HIV/AIDS patients is constructed from pre-ART clinical data, using multiple imputation and stepwise regression to screen predictive factors and statistical indices to evaluate the result. This addresses the reliance of existing models on post-ART data, enabling efficient and accurate virological failure prediction and improving treatment efficacy and resource utilization.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- WUXI PEOPLES HOSPITAL
- Filing Date
- 2024-04-10
- Publication Date
- 2026-06-26
AI Technical Summary
Existing ART virological failure prediction models for HIV/AIDS patients rely on follow-up data after ART, and some predictive factors, such as compliance information, are difficult to obtain, making it difficult to guarantee the accuracy of the models. Furthermore, the modeling process is very difficult, which limits their clinical application.
A virological failure prediction model for ART in HIV/AIDS patients was constructed. Pre-ART clinical data were collected from patients, and multiple imputation and stepwise regression were used to screen predictive factors and establish the model. The selection of predictors was adjusted using the Wald test and the F-distribution, and the accuracy of the model was evaluated with the Brier score and C-index.
It shortens the time to predict virological failures, improves prediction efficiency and accuracy, enables early identification of high-risk patients, improves patient medication adherence, optimizes resource allocation, and improves treatment outcomes.
Figure CN122290980A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of biomedical technology, specifically to a predictive model and validation method for ART virological failure in HIV/AIDS patients.
Background Technology
[0002] Human immunodeficiency virus (HIV) infection remains a major global public health problem, and there is still no viable cure for AIDS. Therefore, suppressing viral replication in HIV-infected individuals through antiretroviral therapy (ART) is crucial. One reason for ART virologic failure (VF) is the numerous side effects of treatment drugs, affecting patient adherence, and the potential for drug resistance and treatment fatigue due to prolonged treatment. Establishing a VF prediction model for first-line HIV antiretroviral therapy to identify patients at high risk of VF would help improve ART adherence and promote the rational allocation of HIV surveillance and treatment resources.
[0003] Research on predictive models for ART virological failure requires a large number of high-quality samples and frequent follow-up monitoring. Furthermore, the low incidence of virological failure (VF) and the difficulty of obtaining adherence information make modeling challenging, so research in this area is limited. Existing VF prediction models developed abroad require follow-up data collected after ART to make predictions, and some predictive factors, such as adherence information, are difficult to obtain and of uncertain accuracy, which limits the clinical application of these models. In addition, existing studies have collected limited baseline information before ART, and the pre-ART baseline characteristics of VF patients remain unclear.
Summary of the Invention
[0004] To address the shortcomings of existing technologies, this invention provides a predictive model and validation method for ART virological failure in HIV/AIDS patients. All of the model's predictive factors are data available before ART begins, which shortens the time to obtain VF prediction results and improves prediction efficiency.
[0005] To achieve the above objectives, the present invention provides the following technical solution: a method for constructing an ART virological failure prediction model for HIV or AIDS patients, comprising the following steps:
[0006] S1. Collect clinical data from HIV or AIDS patients;
[0007] S2. Perform multiple imputation processing on the clinical data to generate multiple datasets;
[0008] S3. In each dataset, use stepwise regression to select predictors;
[0009] S4. Combine the datasets to determine the final predictors;
[0010] S5. Based on the final predictors, establish a virological failure prediction model.
[0011] Preferably, the predictive factors include alanine aminotransferase (ALT), route of infection, initial viral load, marital status, and triglycerides.
[0012] Preferably, the multiple imputation process includes at least 10 imputations.
[0013] Preferably, the method further includes: adjusting the selection of predictors based on the Wald test and F-distribution.
[0014] Preferably, the method further includes: evaluating the accuracy and calibration of the prediction model using Brier scores and C-index.
[0015] Preferably, the Brier score is 0.07 and the C index is 0.936.
[0016] The present invention also provides a predictive model for ART virological failure in HIV or AIDS patients constructed by the above method.
[0017] The present invention also provides a verification method using the prediction model, comprising the following steps:
[0018] 1) Evaluate predictive models based on new patient clinical data;
[0019] 2) Calculate the Brier score and C-index of the prediction results;
[0020] 3) Determine the accuracy of the model's predictions by comparing the predicted results with the actual results.
[0021] Preferably, the verification method further includes: internally verifying the prediction model using 10-fold cross-validation.
[0022] Preferably, the verification method further includes: using a nomogram to verify the accuracy of the prediction results of the prediction model.
[0023] This invention provides a predictive model and validation method for ART virological failure in HIV / AIDS patients.
[0024] It has the following beneficial effects:
[0025] 1. The predictive model constructed in this invention can serve as an effective tool for predicting the risk of ART virological failure in HIV or AIDS patients in clinical practice. The application of this model helps improve treatment effectiveness, optimize resource allocation, and ultimately improve patient outcomes. Furthermore, by continuously collecting new patient data and re-evaluating model performance, the accuracy and practicality of the model can be further enhanced.
[0026] 2. All predictive variables ultimately included in this invention can be obtained before the start of ART, and the model has a good AUC. This significantly shortens the time required to obtain VF prediction results and improves prediction efficiency, allows high-risk patients to be informed of the risk of VF in advance, and can also improve patient medication adherence and ART efficacy to some extent.
Attached Figure Description
[0027] Figure 1 is a schematic diagram of the model construction method of the present invention;
[0028] Figure 2 is a nomogram in an embodiment of the present invention;
[0029] Figure 3 is a schematic diagram of an embodiment of the present invention.
Detailed Implementation
[0030] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0031] Referring to Figures 1 to 3, this invention provides a method for constructing an ART virological failure prediction model for HIV or AIDS patients, comprising the following steps:
[0032] S1. Collect clinical data from HIV or AIDS patients;
[0033] In this step, clinical data are collected from HIV or AIDS patients.
[0034] In this embodiment, the study population is first defined.
[0035] Inclusion criteria include:
[0036] ① Receiving first-line ART as initial treatment;
[0037] ② Diagnosed between 2003 and 2022, with at least one follow-up record;
[0038] ③ Having received at least 12 months of ART after diagnosis;
[0039] ④ Complete ART regimen information;
[0040] ⑤ Newly diagnosed HIV-1 infected individuals aged 15 years or older.
[0041] Exclusion criteria include:
[0042] ① No baseline information for the three months prior to ART;
[0043] ②VL information is incomplete, making it impossible to determine whether VF has occurred;
[0044] ③Pregnant women or breastfeeding women.
[0045] The final sample size was 1059 cases.
[0046] Selection of candidate predictors:
[0047] Based on previous HIV clinical prediction models and some clinical experience, 26 variables were finally selected as candidate predictors (age at diagnosis, sex, marital status, BMI, history of sexually transmitted diseases, route of infection, initial ART treatment regimen, whether to change treatment regimen, WHO clinical classification, HIV-1 subtype, drug resistance, time of delayed treatment (time of first ART treatment - time of diagnosis), viral load (before or at the beginning of ART), CD4 (before or at the beginning of ART), hemoglobin (HB), white blood cell count (WBC), platelet count (PLT), Cr, triglycerides, TCHO, blood glucose, AST, ALT, TBiL, HBV infection, HCV infection, and tuberculosis infection).
[0048] Define outcome metrics:
[0049] Virological failure (VF) is defined as a plasma viral load persistently ≥200 copies/mL after 24 weeks of continuous ART following initiation (start or regimen adjustment), or as virological rebound, that is, a viral load ≥200 copies/mL recurring after complete virological suppression has been achieved.
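The outcome definition above can be sketched as a small classification function. This is a minimal illustration, not the procedure used in the embodiment: the function name, the input format, the reading of "persistent" as two consecutive measurements at or above the threshold, and the reading of "suppression" as any value below 200 copies/mL are all assumptions, since the text does not define them precisely.

```python
from typing import List, Tuple

VF_THRESHOLD = 200            # copies/mL, from the outcome definition
WEEKS_BEFORE_ASSESSMENT = 24  # weeks after ART start or regimen adjustment


def has_virological_failure(measurements: List[Tuple[float, float]]) -> bool:
    """Return True if a viral-load history meets the VF definition.

    measurements: (weeks_since_art_start, viral_load_copies_per_ml) pairs.
    Assumptions (not stated in the source): "persistent" means two
    consecutive eligible values >= threshold, and "suppressed" means any
    value below the threshold.
    """
    ordered = sorted(measurements)
    eligible = [vl for week, vl in ordered if week >= WEEKS_BEFORE_ASSESSMENT]

    # Persistent viremia: two consecutive eligible values at or above threshold.
    for first, second in zip(eligible, eligible[1:]):
        if first >= VF_THRESHOLD and second >= VF_THRESHOLD:
            return True

    # Rebound: a value at or above threshold after an earlier suppressed value.
    suppressed_seen = False
    for _, vl in ordered:
        if vl < VF_THRESHOLD:
            suppressed_seen = True
        elif suppressed_seen:
            return True
    return False
```

A stricter suppression cutoff or a different persistence rule can be substituted by changing the two comparisons; the overall structure stays the same.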
[0050] S2. Perform multiple imputation processing on clinical data to generate multiple datasets;
[0051] In this step, to address potential missing data in clinical data, a multiple imputation method is employed to generate multiple complete datasets, thereby reducing the impact of missing data on model accuracy. Multiple imputation is a statistical technique that allows us to generate multiple possible complete datasets by randomly imputing missing data multiple times, and then analyze these datasets separately.
[0052] In this embodiment, missing predictor values are handled by multiple imputation with 10 imputations. For continuous variables, linearity is assessed using restricted cubic splines; in the nonlinearity test, p < 0.05 indicates that the linear relationship does not hold.
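To illustrate the idea of producing several completed datasets, the sketch below fills each missing value by drawing at random from the observed values of the same variable. This is a deliberately simplified stand-in (a random hot-deck draw), not the imputation model used in the embodiment, which the text does not specify; the function name, the row format, and the seed are hypothetical.

```python
import random
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def multiply_impute(rows: List[Row], m: int = 10, seed: int = 0) -> List[List[Dict[str, float]]]:
    """Return m completed copies of rows, with None values filled in.

    Simplification: each missing value is replaced by a random draw from
    the observed values of the same column. Every column must have at
    least one observed value, otherwise random.choice raises IndexError.
    """
    rng = random.Random(seed)
    columns = list(rows[0].keys())
    observed = {c: [r[c] for r in rows if r[c] is not None] for c in columns}

    datasets = []
    for _ in range(m):
        completed = [
            {c: (r[c] if r[c] is not None else rng.choice(observed[c])) for c in columns}
            for r in rows
        ]
        datasets.append(completed)
    return datasets
```

Because the draws differ between copies, the spread of results across the m datasets reflects the uncertainty introduced by the missing values, which is what the later pooling steps rely on.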
[0053] S3. In each dataset, use stepwise regression to select predictors;
[0054] In this step, a stepwise regression method is used to screen for factors that have a significant predictive effect on virological failure in each imputed dataset. This method involves gradually adding or removing variables through statistical testing until the optimal combination of variables is found.
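[0055] In this embodiment, stepwise regression is applied to each of the imputed datasets to screen the candidate predictors and build a model. Variables that are selected in all 10 imputed datasets are included directly; the remaining variables are then examined in descending order of selection frequency, and the Wald test is used to decide whether each should appear in the final model. In R, pool.compare() is used to compare the models with and without a given variable; if P > 0.05, the variable can be discarded.
[0056] The Wald test formula is as follows.
[0057] The test statistic, in the standard form of the pooled Wald test for multiply imputed data, is
[0058] $$D_w = \frac{(\bar{Q} - Q_0)^{\top}\,\tilde{T}^{-1}\,(\bar{Q} - Q_0)}{k}, \qquad \tilde{T} = (1 + r)\,\bar{U}, \qquad r = \left(1 + \frac{1}{m}\right)\frac{\operatorname{tr}(B\,\bar{U}^{-1})}{k}$$ Here $\bar{Q}$ is the pooled estimate of the $k$ coefficients being tested, $Q_0$ is their value under the null hypothesis, $\bar{U}$ is the average within-imputation covariance, $B$ is the between-imputation covariance, and $m$ is the number of imputations.
[0059] where the p-value for $D_w$ is
[0060] $$P_w = \Pr\left[F_{k,\nu_w} > D_w\right]$$
[0061] where $F_{k,\nu_w}$ is the F distribution with $k$ and $\nu_w$ degrees of freedom, with
[0062] $$\nu_w = \begin{cases} 4 + (t - 4)\left[1 + (1 - 2t^{-1})\,r^{-1}\right]^2 & \text{if } t = k(m - 1) > 4 \\ t\,(1 + k^{-1})(1 + r^{-1})^2 / 2 & \text{otherwise} \end{cases}$$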
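The frequency-based screening described in this paragraph can be sketched as follows. The function name and the example selections are hypothetical; the sketch only counts how often each variable is chosen across the imputed datasets and does not perform the stepwise regression or the Wald test itself.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def rank_by_selection_frequency(
    selected_per_dataset: Sequence[Sequence[str]],
) -> Tuple[List[str], List[Tuple[str, int]]]:
    """Split variables into those selected in every imputed dataset and
    the rest, ordered by how often they were selected (most frequent first).
    """
    m = len(selected_per_dataset)
    counts = Counter(v for selected in selected_per_dataset for v in set(selected))

    always = sorted(v for v, n in counts.items() if n == m)
    to_test = sorted(
        ((v, n) for v, n in counts.items() if n < m),
        key=lambda item: (-item[1], item[0]),
    )
    return always, to_test


# Made-up selections from three imputed datasets, for illustration only.
always_selected, candidates = rank_by_selection_frequency(
    [["ALT", "CD4"], ["ALT"], ["ALT", "CD4", "BMI"]]
)
```

The variables in the first list would enter the model directly, while those in the second list would be passed, in order, to the pooled Wald comparison.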
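For a single coefficient (k = 1) tested against zero, the pooled Wald statistic reduces to the squared pooled estimate divided by the total variance from Rubin's rules. The sketch below computes D_w and the denominator degrees of freedom ν_w for that special case. The function name and inputs are hypothetical, and the p-value step is left out because the F-distribution tail probability is not available in the Python standard library.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_wald_single(estimates: Sequence[float], std_errors: Sequence[float]) -> Tuple[float, float]:
    """Return (D_w, nu_w) for testing one coefficient against zero (k = 1).

    estimates / std_errors: the coefficient and its standard error from
    each of the m imputed datasets (m >= 2).
    """
    m = len(estimates)
    q_bar = mean(estimates)                     # pooled estimate
    u_bar = mean(se ** 2 for se in std_errors)  # within-imputation variance
    b = variance(estimates)                     # between-imputation variance
    r = (1 + 1 / m) * b / u_bar                 # relative increase in variance

    d_w = q_bar ** 2 / ((1 + r) * u_bar)
    if r == 0:
        # No between-imputation variation: the reference F has unbounded
        # denominator degrees of freedom.
        return d_w, float("inf")

    t = m - 1                                   # t = k(m - 1) with k = 1
    if t > 4:
        nu_w = 4 + (t - 4) * (1 + (1 - 2 / t) / r) ** 2
    else:
        nu_w = t * (1 + 1) * (1 + 1 / r) ** 2 / 2
    return d_w, nu_w
```

The resulting D_w would then be compared against an F distribution with 1 and ν_w degrees of freedom to obtain the p-value used for the P > 0.05 discard rule.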
[0063] S4. Combine the datasets to determine the final prediction factors;
[0064] In this step, factors selected from all imputed datasets are comprehensively considered to determine the final predictor factors. This step ensures the stability and accuracy of the model.
[0065] In this embodiment, after screening, five variables were finally included: alanine aminotransferase (ALT), route of infection, initial viral load, marital status, and triglycerides.
[0066] S5. Based on the final predictors, establish a virological failure prediction model;
[0067] In this step, a virological failure prediction model is built based on the determined final predictor factors. Logistic regression, machine learning algorithms, or other statistical methods can be used to construct the model.
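If logistic regression is the method chosen, a model on the five final predictors would take the general form below. This is a sketch only: the coefficients are not reported in the text, the symbols $\mathrm{VL}_0$ (initial viral load) and $\mathrm{TG}$ (triglycerides) are shorthand introduced here, and coding the two categorical predictors (route of infection, marital status) as indicator variables is an assumption.

```latex
\[
\eta = \beta_0 + \beta_1\,\mathrm{ALT} + \beta_2\,\mathrm{VL}_0 + \beta_3\,\mathrm{TG}
     + \sum_{j}\gamma_j\,\mathbb{1}[\mathrm{route}=j]
     + \sum_{l}\delta_l\,\mathbb{1}[\mathrm{marital}=l],
\qquad
\Pr(\mathrm{VF}=1) = \frac{1}{1 + e^{-\eta}}
\]
```

The linear predictor $\eta$ is the quantity a nomogram would convert into points, and the second expression maps it to a predicted probability of virological failure.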
[0068] The model constructed using the method of this invention overcomes the problem of missing data by screening predictive factors through multiple imputation and stepwise regression, and ensures that the predictive factors included in the model have significant predictive value for virological failure. Furthermore, repeating this process on multiple datasets improves the model's robustness and generalization ability. This provides clinicians with a reliable tool to predict the risk of virological failure in patients after ART treatment, thus providing a scientific basis for patient treatment decisions.
[0069] As one embodiment of the present invention, the construction method further includes: adjusting the selection of predictors based on Wald test and F distribution.
[0070] Specifically, based on the stepwise regression method described above for screening predictors, further adjustments are made using the Wald test and F-distribution. The Wald test is a statistical method used to test the significance of predictors in the model. In each stepwise regression step, the Wald test is performed on newly added predictors to assess whether their contribution to predicting virological failure is statistically significant.
[0071] Furthermore, to control the number of predictors in the model and prevent overfitting, an F-distribution-based method is used to adjust the selection of predictors. The F-distribution is a probability distribution commonly used in analysis of variance and regression analysis to compare the explanatory power of models. In this embodiment, we can set a threshold; a predictor is only retained if its inclusion significantly improves the overall explanatory power of the model (i.e., the p-value of the F-test is less than the set significance level).
[0072] By combining the Wald test and an F-distribution-based approach to adjust the selection of predictive factors, the construction method of this invention further improves the accuracy and reliability of the predictive model. The Wald test ensures the significance of each predictor, while the use of the F-distribution helps avoid overfitting due to too many predictors. This strategy enables the final model to generalize better to new patient data, enhancing its practical value in future clinical applications.
[0073] As one embodiment of the present invention, the construction method further includes: evaluating the accuracy and calibration of the prediction model through Brier scores and C-index.
[0074] Specifically, after the model is built, this embodiment uses the Brier score and C-index to evaluate the accuracy and calibration of the prediction model.
[0075] Brier Score: The Brier score is a tool for measuring predictive accuracy; it calculates the mean squared difference between the predicted probability and the actual outcome. The Brier score ranges from 0 to 1, with lower scores indicating higher accuracy. In this embodiment, the Brier score is used to quantify the performance of the predictive model in predicting ART virological failure in HIV or AIDS patients.
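As a minimal sketch of the calculation described above, the Brier score for a binary outcome can be computed as follows. The function name and the example values are hypothetical.

```python
from typing import Sequence


def brier_score(predicted: Sequence[float], observed: Sequence[int]) -> float:
    """Mean squared difference between predicted probabilities and
    observed binary outcomes (1 = virological failure, 0 = no failure)."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("predicted and observed must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(predicted)


# Made-up predictions for four patients, for illustration only.
example_score = brier_score([0.9, 0.1, 0.8, 0.3], [1, 0, 1, 0])
```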
[0076] C-index: The C-index (also known as the concordance index) is a metric for evaluating the ability of a predictive model to rank risks. It is a nonparametric statistic that measures how well the ordering of the model's predictions agrees with the observed outcomes. C-index values range from 0.5 (no predictive ability) to 1 (perfect predictive ability). In this embodiment, the C-index is used to assess the predictive model's ability to distinguish between patients who are likely to experience virological failure and those who are unlikely to experience it.
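For a binary outcome, the C-index can be computed by comparing every pair made of one patient with the event and one without. The sketch below is one straightforward way to do this, with tied predictions counted as half concordant; the function name is hypothetical, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
from typing import Sequence


def c_index(predicted: Sequence[float], observed: Sequence[int]) -> float:
    """Proportion of (event, non-event) pairs in which the patient with
    the event received the higher predicted probability; ties count 0.5."""
    events = [p for p, y in zip(predicted, observed) if y == 1]
    non_events = [p for p, y in zip(predicted, observed) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0
            elif p_event == p_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))
```

For a binary outcome this quantity equals the area under the ROC curve (AUC).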
[0077] These evaluation metrics allow for a comprehensive assessment of the model's predictive performance. A lower Brier score and a higher C-index indicate better predictive accuracy and discriminative ability, making the model suitable for clinical application in predicting the risk of virological failure in HIV or AIDS patients after ART.
[0078] In this embodiment, the ten imputed datasets are stacked into a single dataset for model evaluation. The Brier score reflects both discrimination and calibration; a lower score indicates a better overall fit. This model has a Brier score of 0.07, indicating a good fit. Internal validation is assessed using the C-index (equivalent to the AUC for a binary outcome), which is calculated by 10-fold cross-validation in each of the ten imputed datasets and then averaged. The final C-index is 0.936, indicating good model discrimination.
[0079] Ultimately, statistical validation showed that the established model has high accuracy (Brier score 0.07, C-index 0.936), providing clinicians with a reliable tool to predict the risk of virological failure in patients after ART treatment, thus providing a scientific basis for patients' treatment decisions.
[0080] The present invention also provides a predictive model for ART virological failure in HIV or AIDS patients constructed using the above-described construction method embodiments.
[0081] Based on an embodiment of the construction method of this invention, we provide a predictive model for predicting the risk of virological failure in HIV or AIDS patients after receiving antiretroviral therapy (ART). This model comprehensively considers multiple predictive factors associated with virological failure and is optimized and validated using a data-driven approach.
[0082] 1. Model Construction: Through analysis of historical patient data, this model selected multiple predictive factors, including but not limited to baseline CD4 cell count, viral load, treatment regimen, comorbidities, and socioeconomic factors. These factors were screened and incorporated into the final model using stepwise regression, Wald test, and F-distribution methods.
[0083] 2. Model Evaluation: The accuracy and calibration of the model were evaluated using the Brier score and C-index to ensure that the prediction model has good predictive performance and discrimination ability.
[0084] 3. Application of the model: This predictive model can be applied to clinical decision support systems to help doctors assess the risk of virological failure in patients after receiving ART treatment, thereby providing patients with personalized treatment recommendations and interventions.
[0085] The predictive model constructed in this invention can serve as an effective tool for predicting the risk of ART virological failure in HIV or AIDS patients in clinical practice. The application of this model helps improve treatment effectiveness, optimize resource allocation, and ultimately improve patient outcomes. Furthermore, by continuously collecting new patient data and re-evaluating the model's performance, its accuracy and practicality can be further enhanced.
[0086] The present invention also provides a verification method using the above-described prediction model, comprising the following steps:
[0087] 1) Evaluate predictive models based on new patient clinical data;
[0088] First, the newly collected patient clinical data is evaluated using a predictive model. This can be done by feeding the data into the model and running it to generate a predicted probability of virological failure.
[0089] 2) Calculate the Brier score and C-index of the prediction results;
[0090] Subsequently, the Brier score and C-index are calculated for the predicted results. The Brier score provides a quantitative measure of the agreement between the predicted probabilities and the actual results, while the C-index measures the accuracy of the model in ranking the risk of virological failure among different patients.
[0091] 3) Determine the accuracy of the model's predictions by comparing the predicted results with the actual results;
[0092] Finally, the model's predictive accuracy is determined by comparing the predicted results with the patients' actual clinical outcomes. This can be done using statistical analysis methods, such as using a confusion matrix to assess the model's sensitivity, specificity, and other relevant performance metrics.
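The confusion-matrix comparison mentioned above can be sketched as follows. The function name and the 0.5 probability cutoff are hypothetical; the text does not state which cutoff, if any, is used to turn predicted probabilities into high-risk and low-risk labels.

```python
from typing import Dict, Sequence


def classification_metrics(
    predicted: Sequence[float], observed: Sequence[int], threshold: float = 0.5
) -> Dict[str, float]:
    """Confusion-matrix counts plus sensitivity and specificity.

    A patient is flagged as high risk when the predicted probability is
    at or above the threshold (an assumed cutoff, not from the source).
    """
    tp = fp = tn = fn = 0
    for p, y in zip(predicted, observed):
        flagged = p >= threshold
        if flagged and y == 1:
            tp += 1
        elif flagged:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "sensitivity": sensitivity, "specificity": specificity}
```

Because virological failure is a low-incidence outcome, a lower cutoff than 0.5 may be more useful in practice; the threshold parameter makes that trade-off explicit.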
[0093] This validation method provides clinicians with a tool to assess and validate the performance of predictive models in real-world clinical settings. This approach ensures the model maintains high accuracy and reliability with new patient populations. Furthermore, it supports continuous model improvement, as ongoing validation allows for the identification and adjustment of any shortcomings.
[0094] As one embodiment of the present invention, the verification method further includes: performing internal verification of the prediction model using a 10-fold cross-validation method.
[0095] The specific steps are as follows:
[0096] 1. Dataset Partitioning: The patient dataset is randomly divided into 10 mutually exclusive subsets, each of approximately the same size. Each subset should maintain a consistent data distribution as much as possible to ensure the fairness of the validation.
[0097] 2. Cross-validation process: In 10 iterations, a subset is selected as the validation set in each iteration, and the remaining 9 subsets are combined as the training set. The prediction model is trained using the training set, and its performance is evaluated on the validation set.
[0098] 3. Performance Evaluation: In each iteration, the Brier score and C-index, as well as other possible performance metrics such as accuracy, recall, and F1 score, are calculated for the prediction results on the validation set.
[0099] 4. Comprehensive Analysis: After all iterations are completed, the performance metrics of the 10 iterations are combined to calculate the average Brier score and C-index, as well as the average values of other metrics. These metrics can reflect the overall performance and stability of the model.
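The partitioning in steps 1 and 2 can be sketched as a generator of training and validation indices. The function name and seed are hypothetical, and this version shuffles at random without stratifying by outcome, so it only approximates the "consistent data distribution" requirement in step 1.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (training_indices, validation_indices) for each of k folds.

    The n row indices are shuffled once and dealt into k mutually
    exclusive folds whose sizes differ by at most one.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]

    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# With the 1059 cases of this embodiment, each fold holds 105 or 106 patients.
splits = list(k_fold_indices(1059, k=10))
```

In each iteration the model would be refitted on the training indices and the Brier score and C-index computed on the validation indices, with the ten results averaged as in step 4.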
[0100] The implementation method of using 10-fold cross-validation for internal model validation further confirms the robustness and generalization ability of the predictive model. Since each subset of data has the opportunity to serve as a validation set, this method makes more comprehensive use of limited data resources and reduces bias in the model evaluation process. Furthermore, by averaging performance metrics across multiple iterations, the predictive ability of the model on unknown data can be more accurately evaluated, which is particularly important for clinical applications.
[0101] As one embodiment of the present invention, the verification method further includes: using a nomogram to verify the accuracy of the prediction results of the prediction model.
[0102] A nomogram, also known as an alignment diagram, is based on multivariate regression analysis. It integrates multiple predictive indicators and plots them on a plane using scaled line segments to represent the relationships between variables in the predictive model. This invention uses a nomogram to visually display the results of the final predictive model, as shown in Figure 2.
[0103] Using nomograms to validate the accuracy of predictive models provides clinicians with an intuitive tool to assess the model's ability to predict virological failure risks. Through intuitive graphical representations, physicians can quickly identify which patients are at higher risk, enabling more refined treatment management. Furthermore, the results of statistical tests provide scientific validation of the model's predictive power.
[0104] Although embodiments of the invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.
Claims
1. A method of constructing a model for predicting ART virological failure in HIV or AIDS patients, characterized in that, Includes the following steps: S1. Collect clinical data from HIV or AIDS patients; S2. Perform multiple imputation processing on the clinical data to generate multiple datasets; S3. In each dataset, use stepwise regression to select predictors; S4. Based on the dataset, determine the final prediction factors; S5. Based on the final predictor, establish a virological failure prediction model.
2. The method of constructing a model for predicting HIV or AIDS patient ART virological failure according to claim 1, characterized in that, The predictors include alanine aminotransferase (ALT), route of infection, initial viral load, marital status, and triglycerides.
3. The method of claim 1, characterized in that, The multiple imputation process includes at least 10 imputations.
4. The method of constructing a model for predicting HIV or AIDS patient ART virological failure according to claim 1, characterized in that, Also includes: The selection of predictors is based on Wald test and F-distribution adjustment.
5. The method of constructing a model for predicting HIV or AIDS patient ART virological failure according to claim 1, characterized in that, Also includes: The accuracy and calibration of the prediction model were evaluated using the Brier score and the C-index.
6. The method for constructing an ART virological failure prediction model for HIV or AIDS patients according to claim 5, characterized in that, The Brier score is 0.07, and the C index is 0.936.
7. A predictive model for ART virological failure in HIV or AIDS patients, characterized in that, Constructed by the method described in any one of claims 1-6.
8. A verification method using the prediction model of claim 7, characterized in that, Includes the following steps: 1) Evaluate predictive models based on new patient clinical data; 2) Calculate the Brier score and C-index of the prediction results; 3) Determine the accuracy of the model's predictions by comparing the predicted results with the actual results.
9. The verification method according to claim 8, characterized in that, Also includes: The prediction model was internally validated using 10-fold cross-validation.
10. The verification method according to claim 8, characterized in that, Also includes: A nomogram was used to verify the accuracy of the prediction results of the prediction model.