Activity Number: 462 - Spatio-Temporal Methods for Complex Data
Type: Invited
Date/Time: Wednesday, August 10, 2022 : 2:00 PM to 3:50 PM
Sponsor: Royal Statistical Society
Abstract #320452
Title: Principled Spatial Machine Learning with Random Forests and Gaussian Processes
Author(s): Abhi Datta* and Arkajyoti Saha and Sumanta Basu
Companies: Johns Hopkins University and University of Washington and Cornell University
Keywords: Gaussian Process; Spatial statistics; Machine learning; Random Forests
Abstract:
Spatial linear mixed models, which combine a linear covariate effect with a Gaussian Process (GP) distributed spatial random effect, are widely used to analyze geospatial data. We consider the setting where the covariate effect is non-linear. Random forests (RF) are popular for estimating non-linear functions, but applications of RF to spatial data have often ignored the spatial correlation or treated it in a brute-force manner. We show that these choices adversely affect the performance of RF. We propose RF-GLS, a novel and well-principled extension of RF, for estimating non-linear covariate effects in spatial mixed models where the spatial correlation is modeled using a GP. RF-GLS extends RF in the same way that generalized least squares (GLS) fundamentally extends ordinary least squares (OLS) to accommodate dependence in linear models. RF-GLS can also be used for functional estimation with other types of dependent data, such as time series. We provide extensive theoretical and empirical support for RF-GLS. We also demonstrate the RandomForestsGLS R package, available on CRAN, for analyzing spatial data using RF-GLS.
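
The OLS-to-GLS step that the abstract uses as its analogy can be sketched in a few lines. This is only the linear case, not the RF-GLS algorithm and not the RandomForestsGLS package interface. The exponential covariance, its parameters (`sigma2`, `phi`, `nugget`), the function names, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def exponential_covariance(coords, sigma2=1.0, phi=3.0, nugget=0.1):
    """Assumed GP covariance: sigma2 * exp(-phi * distance) plus a nugget on the diagonal."""
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    return sigma2 * np.exp(-phi * dists) + nugget * np.eye(len(coords))


def ols(X, y):
    """Ordinary least squares: ignores any correlation between observations."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def gls(X, y, Sigma):
    """Generalized least squares via whitening with the Cholesky factor of Sigma."""
    L = np.linalg.cholesky(Sigma)   # Sigma = L @ L.T
    Xw = np.linalg.solve(L, X)      # whitened design matrix
    yw = np.linalg.solve(L, y)      # whitened response
    return np.linalg.lstsq(Xw, yw, rcond=None)[0]


# Simulated spatial data (hypothetical): intercept plus one covariate,
# with spatially correlated errors drawn from the assumed covariance.
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
Sigma = exponential_covariance(coords)
y = X @ beta_true + np.linalg.cholesky(Sigma) @ rng.normal(size=n)

beta_ols = ols(X, y)
beta_gls = gls(X, y, Sigma)
print("OLS estimate:", beta_ols)
print("GLS estimate:", beta_gls)
```

When `Sigma` is the identity matrix, `gls` reduces to `ols`, which is the sense in which GLS extends OLS. The abstract describes RF-GLS as extending RF in the same way.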
Authors who are presenting talks have a * after their name.