Abstract:
|
Gaussian processes and their extensions are widely used models in fields such as statistics and machine learning. To achieve computational feasibility for large datasets, various approximations have been proposed in the literature. One such approach is the Vecchia approximation, which implicitly assumes a sparse Cholesky factor of the precision matrix. It requires ordering the locations, for instance using a space-filling maximum-minimum-distance algorithm, and then specifying a sparsity structure, for instance by nearest-neighbor conditioning. Both steps are typically carried out based on Euclidean distance. Here we propose instead to use a correlation-based distance metric. The Euclidean- and correlation-based approaches are equivalent for isotropic covariances, but the correlation-based approach has two advantages in more complex situations: it can result in more accurate approximations, and it offers a simple, automatic strategy even when Euclidean distance is not applicable. We illustrate our method with simulated and real data.
|
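
The two steps named in the abstract, maximum-minimum-distance ordering and nearest-neighbor conditioning, can be sketched with a correlation-based dissimilarity in place of Euclidean distance. This is only an illustrative sketch, not the authors' implementation. The dissimilarity 1 - |correlation| is an assumed choice (any decreasing function of the absolute correlation yields the same ordering and neighbors), and the function names, the anisotropic exponential covariance, and the parameter values are all hypothetical.

```python
import numpy as np

def correlation_distance(cov):
    """Assumed dissimilarity: 1 - |correlation|, computed from a covariance matrix."""
    sd = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sd, sd)
    return 1.0 - np.abs(corr)

def maximin_order(dist, first=0):
    """Greedy maximum-minimum-distance ordering from a pairwise distance matrix."""
    n = dist.shape[0]
    order = [first]
    # min_dist[j] = distance from point j to the closest already-ordered point
    min_dist = dist[first].astype(float)
    min_dist[first] = -np.inf  # ordered points can never be selected again
    for _ in range(n - 1):
        nxt = int(np.argmax(min_dist))
        order.append(nxt)
        min_dist = np.minimum(min_dist, dist[nxt])
        min_dist[nxt] = -np.inf
    return np.array(order)

def nearest_previous(dist, order, m):
    """For each point, the (up to) m closest points that precede it in the ordering."""
    neighbors = []
    for i, idx in enumerate(order):
        prev = order[:i]
        ranked = np.argsort(dist[idx, prev], kind="stable")[:m]
        neighbors.append(prev[ranked])
    return neighbors

# Hypothetical example: anisotropic exponential covariance on random 2-D locations,
# where correlation decays much more slowly along the first coordinate.
rng = np.random.default_rng(0)
locs = rng.uniform(size=(30, 2))
scaled = locs / np.array([2.0, 0.2])  # assumed per-axis range parameters
diff = scaled[:, None, :] - scaled[None, :, :]
cov = np.exp(-np.sqrt((diff ** 2).sum(axis=-1)))

dist = correlation_distance(cov)
order = maximin_order(dist)
neighbors = nearest_previous(dist, order, m=5)
```

Because both functions take only a pairwise distance matrix, passing a Euclidean distance matrix instead recovers the usual approach, and for an isotropic covariance the two inputs give the same ordering and conditioning sets.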