Online Program

Thursday, October 19
Community
Influence
Knowledge
Thu, Oct 19, 5:00 PM - 7:00 PM
Aventine Ballroom G
Opening Mixer, Speed Poster 1, and Speed Mentoring sponsored by Wiley

Probabilistic Predictive Principal Component Analysis for Spatially-Misaligned and High-Dimensional Air Pollution Data (304025)

*Phuong T Vu, University of Washington 

Environmental studies often focus on the health impacts of long-term air pollution exposure on human subjects. Pollutant concentrations are measured at regulatory monitoring sites, which are usually located away from the study subjects. This spatial misalignment motivates a two-stage modeling approach with an exposure model and a health regression model. In addition, air pollution is often a mixture of many components with different health implications. Conventional approaches incorporate techniques such as principal component analysis (PCA) to obtain a lower-dimensional representation of the data. Recently developed predictive PCA modifies the optimization criterion to improve the exposure model. However, these approaches require complete data. Real-world data tend to have complex missingness patterns, including some pollutants that are measured at relatively few locations and some locations with many missing measurements. We propose a probabilistic version of predictive PCA that allows for flexible imputation and so makes use of all available monitoring data. We demonstrate the performance of probabilistic predictive PCA with simulations and an analysis of multivariate air pollution data.
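To make the missing-data issue concrete, the following is a minimal sketch of generic iterative PCA imputation on a monitor-by-pollutant matrix. It is not the probabilistic predictive PCA method of the talk, which the abstract does not specify. The function name `pca_impute`, its parameters, and the synthetic data are assumptions introduced only for illustration.

```python
import numpy as np

def pca_impute(X, n_components=1, n_iter=200, tol=1e-8):
    """Fill NaNs in X by alternating a rank-k PCA fit with re-imputation.

    Hypothetical helper for illustration: rows are monitoring sites,
    columns are pollutants, NaN marks a missing measurement.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initialize missing entries with the observed column means.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    k = n_components
    for _ in range(n_iter):
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        # Rank-k reconstruction from the leading principal components.
        low_rank = (U[:, :k] * s[:k]) @ Vt[:k] + mu
        # Observed entries stay fixed; only missing ones are updated.
        updated = np.where(missing, low_rank, X)
        change = np.max(np.abs(updated - filled))
        filled = updated
        if change < tol:
            break
    # Final PCA on the completed matrix.
    mu = filled.mean(axis=0)
    U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
    scores = U[:, :k] * s[:k]
    loadings = Vt[:k]
    return filled, scores, loadings

# Synthetic example: 30 sites, 5 pollutants driven by one latent factor.
rng = np.random.default_rng(0)
site_factor = rng.normal(size=30)
pollutant_weights = np.array([1.0, 2.0, -1.0, 0.5, 1.5])
X_true = np.outer(site_factor, pollutant_weights) + 10.0

X_obs = X_true.copy()
holes = [(0, 0), (3, 2), (7, 4), (12, 1), (20, 3)]
for i, j in holes:
    X_obs[i, j] = np.nan

X_filled, scores, loadings = pca_impute(X_obs, n_components=1)
```

On this noise-free, single-factor example the imputed values should land much closer to the held-out truth than column-mean imputation does. Real monitoring data are noisy and spatially structured, which is the setting a probabilistic formulation is meant to address.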