Abstract Details

Activity Number: 552
Type: Contributed
Date/Time: Wednesday, August 3, 2016 : 10:30 AM to 12:20 PM
Sponsor: Section on Bayesian Statistical Science
Abstract #321467
Title: Efficient Bayesian Posterior Sampling for Massive Data Sets
Author(s): Reihaneh Entezari* and Radu V. Craiu and Jeffrey S. Rosenthal
Companies: and University of Toronto and University of Toronto
Keywords: Markov Chain Monte Carlo ; Bayesian Additive Regression Trees ; Importance resampling ; Consensus Monte Carlo

Bayesian posterior sampling is computationally expensive for massive datasets because most Markov chain Monte Carlo (MCMC) methods need at least O(N) operations to draw one sample, where N, the number of observations, is very large. In this paper, we present a new posterior sampling method for big data applications that randomly divides the dataset into subsets and runs independent MCMC chains in parallel, one per subset, on different processors. Each processor draws samples from a predefined distribution given its subset of the data, and the samples from all processors are then combined using importance resampling to produce samples that approximate the full-data posterior. We apply our method to the Bayesian Additive Regression Trees (BART) model and observe better performance than an alternative method, Consensus Monte Carlo, both in the accuracy of the posterior sample approximation and in run time. Furthermore, we apply a modification of our method for BART that significantly improves posterior sampling and, unlike Consensus Monte Carlo, generates posterior distributions that are indistinguishable from the full-data posterior.
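
The divide, sample, and recombine pattern described in the abstract can be illustrated on a toy problem. The sketch below is not the authors' algorithm and does not involve BART. It rests on several assumptions: a Gaussian mean model with known variance, the subset posterior as the "predefined distribution", exact conjugate draws standing in for each processor's MCMC run, and generic importance weights (unnormalized full-data posterior divided by an equal-weight mixture of the subset posteriors). The abstract specifies none of these choices, and all function names are hypothetical.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    """Log density of N(mean, sd^2) at x."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def conjugate_posterior(ys, sigma, tau):
    """Posterior (mean, sd) of theta for y_i ~ N(theta, sigma^2), theta ~ N(0, tau^2)."""
    precision = 1.0 / tau ** 2 + len(ys) / sigma ** 2
    return (sum(ys) / sigma ** 2) / precision, math.sqrt(1.0 / precision)


def divide_and_resample(data, n_subsets, draws_per_subset, n_out, sigma, tau, rng):
    """Toy divide-and-combine sampler (illustrative assumption, not the paper's method)."""
    # Step 1: randomly divide the data into subsets.
    shuffled = list(data)
    rng.shuffle(shuffled)
    subsets = [shuffled[k::n_subsets] for k in range(n_subsets)]

    # Step 2: one sampler per subset. Exact conjugate draws stand in for the
    # MCMC run that each processor would perform in a real application.
    params = [conjugate_posterior(s, sigma, tau) for s in subsets]
    pooled = [rng.gauss(m, sd) for (m, sd) in params for _ in range(draws_per_subset)]

    # Step 3: log importance weights = log target - log proposal.
    # Target: unnormalized full-data posterior. In this toy model the likelihood
    # collapses to sufficient statistics (n, ybar), so it is cheap to evaluate;
    # in general this evaluation is the expensive part.
    n, ybar = len(data), sum(data) / len(data)
    log_w = []
    for theta in pooled:
        log_target = normal_logpdf(theta, 0.0, tau) - n * (theta - ybar) ** 2 / (2.0 * sigma ** 2)
        # Proposal: equal-weight mixture of subset posteriors, via log-sum-exp.
        comps = [normal_logpdf(theta, m, sd) for (m, sd) in params]
        top = max(comps)
        log_q = top + math.log(sum(math.exp(c - top) for c in comps)) - math.log(n_subsets)
        log_w.append(log_target - log_q)

    # Step 4: resample the pooled draws with probability proportional to weight.
    top = max(log_w)
    weights = [math.exp(lw - top) for lw in log_w]
    return rng.choices(pooled, weights=weights, k=n_out)
```

For example, calling `divide_and_resample(data, 4, 2000, 2000, 1.0, 10.0, random.Random(0))` on simulated Gaussian data returns 2000 resampled draws whose mean and spread can be compared against `conjugate_posterior(data, 1.0, 10.0)`. The subset posteriors are wider than the full-data posterior, which is what makes them usable as importance proposals in this toy setting.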

Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

Copyright © American Statistical Association