Keywords: health economics, Bayesian statistics, mixture models, Dirichlet process
Inpatient care accounts for a large share of total health care spending, making analysis of inpatient utilization patterns an important part of understanding the drivers of health care spending growth. Common features of inpatient utilization measures include zero-inflation, over-dispersion, and skewness, all of which complicate statistical modeling. Mixture modeling is a popular approach that can accommodate these features of health care utilization data. In this work, we add a nonparametric clustering component to such models. Our fully Bayesian framework treats the number of mixture components as unknown, so that the data determine it. In simulation studies, we show that this model recovers the true number of mixture components more accurately than model selection based on information criteria. When we apply the framework to data on hospital lengths of stay for patients with lung cancer, we find distinct subgroups of patients that differ in the mean and variance of hospital days and in the relationships between health and treatment variables and length of stay.