Abstract:
|
Count data are very common, and Poisson and negative binomial distributions with a log-linear model are often used to model them. In recent years, zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models have been used for count data with an excess of zero counts (zero inflation). The hurdle model was developed for count data in which zero counts may be either inflated or deflated. However, when the data are collected repeatedly over time and also have a clustered structure, modeling them appropriately is challenging. Our interest is in studying health professional shortage areas (HPSAs) using data collected longitudinally from each county in the United States. The outcome variable is the number of MDs in each county in each year, and counties are both nested within states and treated as geographically correlated. We develop a Bayesian hurdle model with a multi-layered random-effects structure to analyze these longitudinal, clustered count data and to identify important factors that affect the shortages. Dependence across years is incorporated through a time-varying random effect for each state, and a flexible spline model induces spatial correlation across counties.
|