Abstract:
|
We seek to develop a variable screening and selection method for Bayesian mixture models with longitudinal data, using data from the Health and Retirement Study (HRS) conducted by the University of Michigan. Treating yearly out-of-pocket expenditures as a longitudinal response variable, we consider a Bayesian mixture model with K components. The data include a large collection of demographic and health-related baseline characteristics, and we wish to find a subset of these that affects cluster membership. An initial mixture model without any cluster predictors is first fit to the data with an MCMC algorithm; a variable screening step then identifies a set of candidate predictors that may be associated with the cluster configurations. For each predictor we choose a discrepancy measure, such as a Bayes factor or a frequentist hypothesis test, that measures the differences in the predictor values across clusters. A large discrepancy provides evidence that the clusters (and the corresponding response trajectories) differ with respect to that baseline characteristic, and the discrepancies are used to choose a small set of predictors to include in a multinomial probit model for cluster membership.
|
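The screening step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single hard cluster assignment per subject (for example, a point estimate taken from the MCMC output), continuous predictors, and a one-way ANOVA F statistic standing in as the frequentist discrepancy measure. The names `f_statistic`, `screen_predictors`, and `top_m`, and the toy numbers, are hypothetical and are not HRS data.

```python
from statistics import fmean


def f_statistic(values, labels):
    """One-way ANOVA F statistic of `values` grouped by cluster `labels`."""
    groups = {}
    for v, k in zip(values, labels):
        groups.setdefault(k, []).append(v)
    n, g = len(values), len(groups)
    if g < 2 or n <= g:
        raise ValueError("need at least two clusters and n > number of clusters")
    grand = fmean(values)
    # Between-cluster and within-cluster sums of squares.
    between = sum(len(x) * (fmean(x) - grand) ** 2 for x in groups.values())
    within = sum((v - fmean(x)) ** 2 for x in groups.values() for v in x)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (g - 1)) / (within / (n - g))


def screen_predictors(predictors, labels, top_m):
    """Rank predictors by discrepancy across clusters; keep the top_m largest."""
    scores = {name: f_statistic(vals, labels) for name, vals in predictors.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_m], scores


# Toy illustration with made-up numbers (assumption: two clusters, six subjects).
labels = [0, 0, 0, 1, 1, 1]
predictors = {
    "age": [60, 62, 61, 75, 77, 76],  # differs strongly across clusters
    "noise": [1, 2, 3, 1, 2, 3],      # same values in both clusters
}
selected, scores = screen_predictors(predictors, labels, top_m=1)
print(selected, scores)
```

In this sketch the predictors that survive screening would then enter the multinomial probit model for cluster membership. A Bayes factor could replace the F statistic without changing the ranking logic.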