Keywords: prospective healthcare, machine learning, risk prediction, HIV, BART, XGBoost, GBM, stacked ensembles
A major theme in modern medicine is prospective healthcare: the capability to estimate patients’ future medical risks with the goal of achieving maximum health impact. Such capability would facilitate planning appropriate treatment pathways and preventing adverse clinical trajectories through timely interventions. HIV viral load (VL) monitoring is the current gold standard for assessing adherence and/or response to antiretroviral therapy (ART). Virologic failure occurs when ART fails to suppress a person’s viral load to undetectable levels. Timely prediction of virologic failure in susceptible patients is critical both for reducing HIV transmission and for preventing patient-level clinical failure through proactive, targeted adherence monitoring and/or drug resistance monitoring. Studies have shown that elevated viral loads both increase morbidity for patients and create the potential for HIV transmission. Because patients with virologic failure are at elevated risk of transmitting HIV, delays in predicting and addressing non-suppression create a significant public health risk. As a step towards personalised medicine in HIV care and treatment, modern machine learning models were evaluated in order to build accurate risk prediction models that could be used to anticipate and mitigate the risk of virologic failure. Bayesian additive regression trees, penalized logistic regression, gradient boosting frameworks and stacked ensemble techniques were among the methods that exhibited the highest cross-validated sensitivity and specificity. These models were trained and evaluated using a de-identified dataset extracted from an electronic health record (EHR) system serving over 90,000 patients in Western Kenya.
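The abstract compares methods by cross-validated sensitivity and specificity without spelling out the computation. The following is a minimal sketch of one common way to compute these metrics, not the study's actual pipeline. It assumes binary labels (1 = virologic failure, 0 = suppressed), out-of-fold predictions that are already available, and a fold index for each patient. The function names and the toy data are hypothetical, and averaging per-fold values is only one convention; the source does not state which was used.

```python
from statistics import mean


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = virologic failure)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # failures correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # failures missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # suppressed, correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # suppressed, wrongly flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def cross_validated_metrics(y_true, y_pred, fold_ids):
    """Average sensitivity and specificity over the held-out folds."""
    per_fold = []
    for fold in sorted(set(fold_ids)):
        idx = [i for i, f in enumerate(fold_ids) if f == fold]
        per_fold.append(
            sensitivity_specificity([y_true[i] for i in idx],
                                    [y_pred[i] for i in idx])
        )
    mean_sensitivity = mean(s for s, _ in per_fold)
    mean_specificity = mean(s for _, s in per_fold)
    return mean_sensitivity, mean_specificity


if __name__ == "__main__":
    # Toy, made-up data for illustration only (not from the study).
    y_true = [1, 1, 0, 0, 1, 0, 0, 0]
    y_pred = [1, 0, 0, 0, 1, 1, 0, 0]
    fold_ids = [0, 0, 0, 0, 1, 1, 1, 1]
    print(cross_validated_metrics(y_true, y_pred, fold_ids))
```

In this setting sensitivity is the share of patients with virologic failure whom the model flags, and specificity is the share of suppressed patients whom it correctly leaves unflagged, so reporting both shows the trade-off between missed failures and unnecessary follow-up.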