Friday, February 21
CS06 Algorithmic Data Analyses
Fri, Feb 21, 11:00 AM - 12:30 PM
Random Forest Procedure for Classification of Etiologies in Acute Liver Failure Patients (302734)
Valerie L Durkalski, Medical University of South Carolina
William M Lee, University of Texas Southwestern Medical Center
*Jaime Lynn Speiser, Medical University of South Carolina
Keywords: Random forest, classification, acute liver failure, etiology
Classification of objects into pre-specified groups based on known information is a fundamental problem in the field of statistics. Many approaches for solving this problem exist; however, finding an accurate classification model can be challenging in some settings. In this study, the classification of acute liver failure (ALF) patients into etiology categories based on hospital admission data is investigated. We apply the random forest (RF) procedure for the prediction of etiology to the NIDDK-funded ALF Study Group registry data. Statistical classification of etiology, or cause, of ALF is challenging because of imbalanced data, large amounts of missing data, and correlated predictors. RF was selected as the procedure for this study because it can address these challenges. Though RF offers improved accuracy compared to other methods, presenting and interpreting information from the model can be difficult. The purpose of this study is to explore how the RF procedure may be used to improve diagnostic determinations of ALF etiologies. We present typical questions that arise in the general framework of classification and offer interpretations of results from the RF model.
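As a rough illustration of the three challenges named in the abstract (imbalanced classes, missing values, correlated predictors), the sketch below fits a random forest to synthetic data. It is not the authors' analysis and does not use the ALF Study Group registry. The use of scikit-learn, the simulated data set, the 20% missingness rate, median imputation, and class weighting are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in for admission data (assumption, not registry data):
# three imbalanced classes, with redundant features acting as correlated predictors.
X, y = make_classification(
    n_samples=1000,
    n_features=12,
    n_informative=5,
    n_redundant=4,
    n_classes=3,
    weights=[0.7, 0.2, 0.1],
    random_state=0,
)

# Set roughly 20% of entries to missing, completely at random (assumption).
missing_mask = rng.random(X.shape) < 0.2
X[missing_mask] = np.nan

# Stratify the split so the rare classes appear in both training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Median imputation followed by a class-weighted random forest.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=0
    ),
)
model.fit(X_train, y_train)

pred = model.predict(X_test)

# Balanced accuracy averages per-class recall, so the majority class does not dominate.
score = balanced_accuracy_score(y_test, pred)

# Impurity-based importances, one starting point for interpreting the fitted forest.
importances = model.named_steps["randomforestclassifier"].feature_importances_

print(f"balanced accuracy: {score:.3f}")
print("features ranked by importance:", np.argsort(importances)[::-1])
```

Impurity-based importances can be spread across correlated predictors, so a ranking like this is a starting point for interpretation, not a definitive statement about which variables drive the classification.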