Online Program

Friday, February 21
CS06 Algorithmic Data Analyses Fri, Feb 21, 11:00 AM - 12:30 PM
Bayshore VI

Random Forest Procedure for Classification of Etiologies in Acute Liver Failure Patients (302734)

Valerie L Durkalski, Medical University of South Carolina 
William M Lee, University of Texas Southwestern Medical Center 
*Jaime Lynn Speiser, Medical University of South Carolina 

Keywords: Random forest, classification, acute liver failure, etiology

Classifying objects into pre-specified groups based on known information is a fundamental problem in statistics. Many approaches to this problem exist; however, finding an accurate classification model can be challenging in some settings. This study investigates the classification of acute liver failure (ALF) patients into etiology categories based on hospital admission data. We apply the random forest (RF) procedure to predict etiology using the NIDDK-funded ALF Study Group registry data. Statistical classification of the etiology, or cause, of ALF is challenging because of imbalanced data, large amounts of missing data, and correlated predictors. RF was selected for this study because it can address these challenges. Although RF can offer improved accuracy compared to other methods, presenting and interpreting information from the model can be difficult. The purpose of this study is to explore how the RF procedure may be used to improve diagnostic determinations of ALF etiologies. We present typical questions that arise in the general framework of classification and offer interpretations of results from the RF model.