Abstract:
|
As ever larger amounts of data are produced (e.g., by remote sensing instruments and numerical models), the statistical and computational techniques needed to handle data sets with millions of observations have lagged behind. While a variety of statistical methods have been developed theoretically to tackle this problem, readily available computational implementations that work with irregularly spaced observations are still rare. We introduce a set of computational implementations of the Multi-resolution Approximation (MRA), a recently developed spatial statistical method that lends itself particularly well to massive parallelization. The implementations range from fairly simple parallelization strategies, targeting small computing units such as laptops, to sophisticated implementations in C++ with OpenMP and MPI that leverage high-performance computing infrastructure. We show and compare results for data sets with millions of observations and discuss practical challenges that arise with such massive data sets.
|