Keywords: Covariate Adjustment, Clinical Trials, Machine Learning
The gold standard for clinical trial design is the randomized controlled trial with treatment and control arms. However, this design is not without confounding issues, most notably imbalances between arms that can lead to type I or type II errors. One approach to minimizing these errors is to apply covariate adjustment when assessing treatment effects. The primary requirement for any adjustment protocol is a prognostic covariate specified a priori. While prognostic covariates are available for many diseases, there is a subset of highly heterogeneous, multifactorial, or idiopathic diseases for which readily available covariates are lacking. In particular, neurodegenerative diseases such as Alzheimer’s Disease (AD) and Amyotrophic Lateral Sclerosis (ALS) are representative diseases in which univariate biomarkers have met with limited statistical success. Recent strides in machine learning, coupled with the prevalence of publicly available aggregates of clinical trial data, present an opportunity to derive meaningful prognostic covariates for diseases that would not otherwise be amenable to covariate-adjusted analyses. We present work detailing the application of a novel machine learning model developed for ALS to create a prediction-based covariate, and we demonstrate its use in covariate adjustment simulations. Further, we demonstrate that, by using a newly standardized approach to data cleaning, analysis, and model development, we can rapidly deploy a similarly effective biomarker for AD: predictive models trained on a publicly available clinical trials dataset yield validated prognostic covariates. The ability to rapidly deploy machine learning-based predictive covariates in new disease areas represents a novel step forward in applying these powerful new tools more broadly in the clinical trials space.
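To illustrate why a prognostic covariate helps, the following is a minimal simulation sketch, not the models or simulations used in this work. It assumes a hypothetical standardized prediction score (`score`) that is linearly related to the outcome, a true treatment effect of 0.5, and 1:1 randomization; all names and parameter values are illustrative assumptions. It compares the spread of the unadjusted difference in means against an estimate adjusted for the score by ordinary least squares.

```python
import numpy as np


def simulate_trial(rng, n=200, effect=0.5, prognostic_strength=1.0):
    """Simulate one two-arm trial and return (unadjusted, adjusted) effect estimates.

    All parameter values are illustrative assumptions, not values from the study.
    """
    # 1:1 randomization to control (0) or treatment (1).
    treat = rng.permutation(np.repeat([0, 1], n // 2))
    # Hypothetical prognostic score, standing in for a model prediction.
    score = rng.normal(size=n)
    # Outcome depends on treatment, the prognostic score, and unexplained noise.
    outcome = effect * treat + prognostic_strength * score + rng.normal(size=n)

    # Unadjusted estimate: simple difference in arm means.
    unadjusted = outcome[treat == 1].mean() - outcome[treat == 0].mean()

    # Adjusted estimate: treatment coefficient from a linear model
    # that also includes the prognostic score.
    design = np.column_stack([np.ones(n), treat, score])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    adjusted = coef[1]
    return unadjusted, adjusted


def compare_estimators(n_trials=2000, seed=0, **kwargs):
    """Repeat the simulated trial and summarize both estimators."""
    rng = np.random.default_rng(seed)
    estimates = np.array([simulate_trial(rng, **kwargs) for _ in range(n_trials)])
    return {
        "unadjusted_mean": estimates[:, 0].mean(),
        "unadjusted_sd": estimates[:, 0].std(ddof=1),
        "adjusted_mean": estimates[:, 1].mean(),
        "adjusted_sd": estimates[:, 1].std(ddof=1),
    }


if __name__ == "__main__":
    for name, value in compare_estimators().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions both estimators center on the true effect, but the adjusted estimate varies less from trial to trial, which is the mechanism by which adjustment can reduce type II error at a fixed sample size. The size of the gain depends on how strongly the covariate predicts the outcome.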