Abstract:
|
Dietary intake is a major modifiable risk factor for cardiovascular disease. Supervised Bayesian nonparametric clustering methods can characterize dietary intake patterns and their effects on cardiovascular disease risk. However, when data come from surveys whose design involves unequal probabilities of selection, this complex survey design must be accounted for to avoid biased estimation and inference. Working from an overfitted finite mixture model framework, we explore two approaches that use sampling weights to adjust for survey design and apply them in a supervised clustering setting. The first approach replaces the likelihood with a weighted pseudo-likelihood in the posterior update. The second approach uses a weighted finite population Bayesian bootstrap to generate a pseudo-population, which is then integrated into the Markov chain Monte Carlo algorithm. We apply both methods to categorical dietary consumption data and binary cardiovascular disease data from representative surveys, and discuss their performance through simulation studies, in an effort to better understand the impact of diet on cardiovascular disease risk.
|