Thursday, October 19

Thu, Oct 19, 2:45 PM - 3:50 PM
Aventine Ballroom E

Speed Session 1

Probabilistic Predictive Principal Component Analysis for Spatially Misaligned and High-Dimensional Air Pollution Data (303779)

Adam A Szpiro, University of Washington
*Phuong T Vu, University of Washington

Keywords: air pollution, dimension reduction, principal component analysis, missing data, latent variable model, spatial misalignment, universal kriging

Environmental studies often focus on the health effects of long-term air pollution exposure in human subjects. Pollutant concentrations are measured at regulatory monitoring sites, which usually differ from the locations of the study subjects. This spatial misalignment motivates a two-stage modeling approach consisting of an exposure model and a health regression model. In addition, air pollution is often a mixture of many components with different health implications. Conventional approaches use techniques such as principal component analysis (PCA) to obtain a lower-dimensional representation of the data. The recently developed predictive PCA modifies the optimization criterion to improve the exposure model. However, these approaches require complete data. Real-world data tend to have complex missingness patterns, including some pollutants that are measured at relatively few locations and some locations with many missing measurements. We propose a probabilistic version of predictive PCA that allows for flexible imputation and uses all available monitoring data. We demonstrate the performance of probabilistic predictive PCA with simulations and an analysis of multivariate air pollution data.
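
To make the missing-data problem concrete, the following is a minimal sketch of iterative low-rank (PCA-based) imputation on a sites-by-pollutants matrix. It is not the probabilistic predictive PCA described in the abstract: it ignores spatial structure and the predictive criterion, and it uses hard imputation rather than a latent variable model. The function name iterative_pca_impute and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def iterative_pca_impute(X, n_components=2, n_iter=200, tol=1e-8):
    """Fill NaNs in X (sites x pollutants) by alternating between a
    rank-k PCA fit and re-imputation of the missing cells.

    Illustrative only: assumes every column has at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    k = n_components

    # Start from column-mean imputation.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)

    for _ in range(n_iter):
        # Rank-k PCA reconstruction of the currently filled matrix.
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        low_rank = (U[:, :k] * s[:k]) @ Vt[:k] + mu

        # Observed cells keep their measured values; only missing cells change.
        updated = np.where(missing, low_rank, X)
        change = np.max(np.abs(updated - filled))
        filled = updated
        if change < tol:
            break

    # Final PCA on the completed matrix: loadings and per-site scores.
    mu = filled.mean(axis=0)
    _, _, Vt = np.linalg.svd(filled - mu, full_matrices=False)
    components = Vt[:k]
    scores = (filled - mu) @ components.T
    return filled, scores, components
```

In a two-stage analysis of the kind the abstract describes, the per-site scores would then be the quantities carried into an exposure model. A probabilistic formulation would replace the hard fill-in step with expectations under a latent variable model.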


American Statistical Association
