Abstract:
We present a novel clustering approach for analyzing multivariate environmental exposures and health outcomes in cohort studies. A motivating application is air pollution epidemiology, where multi-pollutant data are available from regulatory monitoring networks, but the monitors are not located at cohort locations. Our method, predictive k-means, incorporates geographic covariates both to identify clusters and to predict cluster membership at cohort locations. The procedure can be derived from a mixture of normal distributions and is fit using a version of the EM algorithm. We compare this approach to k-means clustering followed by spatial prediction. In simulations, we demonstrate that predictive k-means can reduce prediction error by over 50% compared to k-means, with minimal loss in cluster representativeness. In the NIEHS Sister Study cohort, we find that the association between systolic blood pressure (SBP) and long-term fine particulate matter (PM2.5) exposure varies significantly across clusters of PM2.5 component profiles. For PM2.5 with low particulate nitrate fractions, a 10 μg/m³ difference in PM2.5 is associated with 3.28 mmHg (95% CI: 1.76, 4.81) higher SBP.
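To make the mixture-of-normals and EM description concrete, the following is a minimal sketch of one way such a procedure could look. It is not the authors' implementation. It assumes (none of this is stated in the abstract) that each cluster is a spherical normal with a fixed common variance `sigma2`, that membership probabilities are a softmax of the geographic covariates, and that the covariate coefficients are updated by a few gradient steps per iteration. The names `fit_predictive_kmeans` and `predict_cluster` are hypothetical.

```python
import numpy as np


def softmax(a):
    """Row-wise softmax with the usual max-subtraction for stability."""
    a = a - a.max(axis=1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)


def fit_predictive_kmeans(X, R, K, sigma2=1.0, n_iter=50, n_grad=20,
                          lr=0.5, seed=0):
    """Illustrative EM fit (assumed model, see lead-in).

    X: (n, p) pollutant profiles at monitor locations.
    R: (n, d) geographic covariates, including an intercept column.
    Returns cluster centers mu (K, p) and coefficients gamma (d, K).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    mu = X[rng.choice(n, size=K, replace=False)].copy()  # random initial centers
    gamma = np.zeros((R.shape[1], K))
    for _ in range(n_iter):
        # E-step: responsibilities combine the covariate-based prior with
        # the spherical normal likelihood around each center.
        prior = softmax(R @ gamma)
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        H = softmax(np.log(prior + 1e-12) - d2 / (2.0 * sigma2))
        # M-step for centers: responsibility-weighted means.
        mu = (H.T @ X) / (H.sum(axis=0)[:, None] + 1e-12)
        # M-step for gamma: gradient ascent on sum_ik H_ik * log(prior_ik).
        for _ in range(n_grad):
            gamma += lr * R.T @ (H - softmax(R @ gamma)) / n
    return mu, gamma


def predict_cluster(R_new, gamma):
    """Most probable cluster at new locations, using covariates only."""
    return softmax(R_new @ gamma).argmax(axis=1)
```

In this sketch, prediction at cohort locations needs only the covariates there, which is what distinguishes it from clustering first and building a spatial prediction model afterwards. Under these assumptions, a smaller `sigma2` makes the assignments harder and closer to ordinary k-means.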
ASA Meetings Department
Copyright © American Statistical Association.