Online Program

Friday, February 15
Fri, Feb 15, 5:15 PM - 6:30 PM
St. James Ballroom
Poster Session 2 and Refreshments

The Effect of Sampling Methods on Machine Learning Models for Predicting Long-Term Length of Stay: A Case Study for Rhode Island Hospitals (303920)


*Son Nguyen, Bryant University 

Keywords: Imbalanced Data, Resampling, Length of Stay, Rare Event

The ability to predict which patients will have a long-term length of stay (LOS) can aid a hospital’s admission management, support effective resource utilization, and help provide high-quality inpatient care. Hospital Discharge Data from the Rhode Island Department of Health for 2010-2013 reveal that inpatients with long-term stays, i.e., two weeks or more, cost about six times more than those with short stays while accounting for only 4.7% of inpatients. Because long-stay patients are so heavily outnumbered by short-stay patients, predicting long-term LOS becomes an imbalanced classification problem. Sampling methods, which balance the data before a traditional classification model is fitted to it, offer a simple approach to the problem. In this work, the authors propose a new resampling method called RUBIES, which provides superior predictive ability compared with other commonly used sampling techniques.
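
To make the idea of balancing the data before fitting a classifier concrete, here is a minimal sketch of plain random undersampling, one of the commonly used sampling techniques. It is not RUBIES, which the abstract does not describe. The function name `random_undersample`, the toy rows, and the 1,000-stay sample size are assumptions made for illustration; only the 4.7% long-stay rate comes from the abstract.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class is as small as the rarest class.

    Hypothetical helper for illustration; this is generic random
    undersampling, not the RUBIES method named in the abstract.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    # Every class is cut down to the size of the smallest class.
    target = min(len(indices) for indices in by_class.values())

    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, target))
    kept.sort()  # preserve the original row order

    return [rows[i] for i in kept], [labels[i] for i in kept]


# Synthetic toy data (an assumption): 1,000 stays, 47 of them long (label 1),
# mirroring the 4.7% long-stay rate quoted in the abstract.
labels = [1] * 47 + [0] * 953
rows = [[i] for i in range(1000)]  # placeholder feature vectors

balanced_rows, balanced_labels = random_undersample(rows, labels)
class_counts = Counter(balanced_labels)
```

In this sketch `class_counts` holds 47 rows for each class, so a classifier fitted to `balanced_rows` sees long and short stays equally often. Resampling of this kind is normally applied to the training split only, so that evaluation still reflects the original class distribution.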