Abstract:
|
We analyze the spatial predictability of 14 traffic-related pollutants using data collected by the Center for Clean Air Research (CCAR) in the summer and winter of 2013 in Baltimore, MD. Our goal is to understand how well these pollutants can be predicted from geographic information system (GIS) covariates. We compare four prediction approaches for spatial data. The first two are based on universal kriging (UK). In the first approach, we model the mean structure in UK using a lower-dimensional representation of the GIS covariates obtained via partial least squares (PLS). In the second approach, we perform a two-step variable selection: elastic-net penalized regression, followed by a search for the subset of the retained variables with the lowest Mallows's Cp score. We then use the selected variables in UK. In the third approach, as an alternative to UK, we apply a random forest algorithm using the GIS covariates and thin-plate splines. Finally, in the fourth approach, we predict the pollutants via Bayesian additive regression trees (BART), a Bayesian version of random forests.
|