Online Program

All Times EDT

Thursday, June 4
Data Visualization
Divide and Recombine for Big Data Analysis and Visualization
Thu, Jun 4, 1:20 PM - 2:55 PM
TBD
 

Distributed Bayesian Varying Coefficient Modeling Using a Gaussian Process Prior (308171)

*Sanvesh Srivastava, University of Iowa 

Keywords: data augmentation, divide-and-conquer Bayesian inference, Gaussian process priors, varying coefficient models

Bayesian varying coefficient models are widely used for estimating nonlinear regression functions that are easy to interpret. We focus on models based on Gaussian process (GP) priors, which are very flexible but have seen limited application in massive data settings, mainly because the Markov chain Monte Carlo (MCMC) computations required for posterior inference are prohibitively slow. We develop a parameter-expanded data augmentation-type algorithm based on the divide-and-conquer (D&C) technique. The algorithm scales easily to large data sets in three steps: it forms many data subsets by subsampling, performs posterior computations in parallel on the subsets, and combines the posterior samples obtained from the subsets. The combined MCMC samples are then used for posterior inference in place of MCMC samples from the full data posterior distribution. Based on the optimal convergence rate of the D&C posterior distribution and analytic properties of GPs, we also provide guidance for choosing the number of subsets.
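
The subset, compute, and combine steps can be illustrated with a minimal sketch on a much simpler model than the one in the talk. Everything below is an assumption made for illustration and is not taken from the abstract: the model is a normal mean with known variance and a conjugate normal prior (so subset posteriors can be sampled directly instead of by MCMC), each subset likelihood is raised to the power k (the number of subsets) so that a subset posterior has roughly the spread of the full data posterior, and subset draws are combined by averaging their sorted values, which is the one-dimensional Wasserstein barycenter. The function names are hypothetical.

```python
import random
import statistics


def subset_posterior_draws(y_subset, k, n_draws, rng, sigma2=1.0, tau2=100.0):
    """Sample the subset posterior of a normal mean, likelihood raised to power k.

    Assumed toy model: y_i ~ N(theta, sigma2), prior theta ~ N(0, tau2).
    """
    precision = 1.0 / tau2 + k * len(y_subset) / sigma2
    mean = (k * sum(y_subset) / sigma2) / precision
    sd = precision ** -0.5
    return [rng.gauss(mean, sd) for _ in range(n_draws)]


def combine_by_quantile_averaging(draws_per_subset):
    """Average sorted draws across subsets (1-D Wasserstein barycenter)."""
    sorted_draws = [sorted(d) for d in draws_per_subset]
    return [statistics.fmean(column) for column in zip(*sorted_draws)]


def run_demo(n=6000, k=6, n_draws=2000, seed=0, sigma2=1.0, tau2=100.0):
    rng = random.Random(seed)
    y = [rng.gauss(2.0, sigma2 ** 0.5) for _ in range(n)]

    # Step 1: split the data into k disjoint subsets of equal size.
    subsets = [y[j::k] for j in range(k)]

    # Step 2: sample each subset posterior. The calls are independent,
    # so they could run in parallel; a plain loop keeps the sketch short.
    draws = [
        subset_posterior_draws(s, k, n_draws, rng, sigma2, tau2) for s in subsets
    ]

    # Step 3: combine the subset draws into one set of posterior draws.
    combined = combine_by_quantile_averaging(draws)

    # Exact full data posterior, available here only because the toy
    # model is conjugate; used as a reference for the combined draws.
    full_precision = 1.0 / tau2 + n / sigma2
    full_mean = (sum(y) / sigma2) / full_precision
    full_sd = full_precision ** -0.5
    return combined, full_mean, full_sd


if __name__ == "__main__":
    combined, full_mean, full_sd = run_demo()
    print("combined mean:", statistics.fmean(combined), "full mean:", full_mean)
    print("combined sd:", statistics.stdev(combined), "full sd:", full_sd)
```

In this toy setting the combined draws can be checked against the exact full data posterior. For a GP-based varying coefficient model no such closed form is available, and the sampling in step 2 would be done by MCMC on each subset.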