Keywords: Hospital Readmission Rates, Machine Learning, Predictive Modeling, Diabetes
The hospital readmission rate is an important measure of health care quality and a major contributor to total medical expenditures. Diabetes is one of the chronic diseases associated with high hospital readmission rates, and identifying high-risk diabetic patients in advance could help prevent future readmissions. The main objective of this study is to build accurate predictive models for early hospital readmission (within 30 days of discharge) of diabetic patients and to identify the key risk factors for readmission. After preprocessing a dataset obtained from the UCI Machine Learning Repository, classical statistical methods and machine learning classification techniques were applied to predict early readmission risk probabilities. In particular, the performance of statistical methods such as generalized logistic, generalized additive, and mixed models was compared with that of machine learning algorithms such as gradient boosting, random forests, and support vector classification. A class-weighted random forest classifier achieved a substantial improvement in predictive power on the imbalanced dataset. The number of inpatient visits, discharge disposition, the numbers of lab procedures and medications, and time in hospital were identified as the most predictive features of early readmission.
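
To illustrate why class weighting matters on an imbalanced dataset, the following minimal sketch computes "balanced" class weights, n_samples / (n_classes * class_count), which is the heuristic behind scikit-learn's `class_weight="balanced"` option. It then shows how those weights change the Gini impurity that a tree-based classifier would see at a node. The abstract does not state which weighting scheme the study used, so this formula is an assumption. The toy labels and the function names `balanced_class_weights` and `weighted_gini` are hypothetical and are not taken from the study.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Assumption: the 'balanced' heuristic; the study's actual weights
    are not given in the abstract.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_gini(labels, weights):
    """Gini impurity of a node where each sample counts with its class weight."""
    mass = Counter()
    for y in labels:
        mass[y] += weights[y]
    total = sum(mass.values())
    return 1.0 - sum((m / total) ** 2 for m in mass.values())


# Hypothetical toy labels: 1 = readmitted within 30 days, 0 = not readmitted.
toy_labels = [0] * 9 + [1] * 1

weights = balanced_class_weights(toy_labels)
unweighted = weighted_gini(toy_labels, {0: 1.0, 1: 1.0})
reweighted = weighted_gini(toy_labels, weights)
```

With unit weights the toy node looks nearly pure because the majority class dominates. With balanced weights both classes carry equal total mass, so splits that isolate the rare early-readmission cases become more attractive to the tree.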