
Abstract Details

Activity Number: 539 - SPEED: Bayesian Methods and Applications in the Life and Social Sciences
Type: Contributed
Date/Time: Wednesday, August 1, 2018, 11:35 AM to 12:20 PM
Sponsor: Section on Bayesian Statistical Science
Abstract #332788
Title: Blocking Collapsed Gibbs Sampler for Latent Dirichlet Allocation Models
Author(s): Xin Zhang* and Scott Sisson
Companies: Pfizer (China) Research and Development Co., Ltd. and University of New South Wales
Keywords: blocking; Gibbs sampling; latent Dirichlet allocation; collapsing; nested simulation

The latent Dirichlet allocation (LDA) model is a widely used latent variable model in machine learning for text analysis. Inference for this model typically involves a single-site collapsed Gibbs sampling step for the latent variables associated with observations. The efficiency of this sampling is critical to the success of the model in practical large-scale applications. In this article, we introduce a blocking scheme for the collapsed Gibbs sampler for the LDA model which can, with a theoretical guarantee, improve chain mixing efficiency. We develop an O(log K)-step nested simulation to directly sample the latent variables within each block. We demonstrate that the blocking scheme achieves substantial improvements in chain mixing compared to the state-of-the-art single-site collapsed Gibbs sampler. We also show that when the number of topics is large, the blocking scheme can achieve a significant reduction in computation time compared to the single-site sampler.
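For context, a minimal sketch of the baseline single-site collapsed Gibbs sampler that the abstract refers to (not the authors' blocking scheme) is given below. Each token's topic assignment is resampled from its full conditional, which is proportional to (doc-topic count + alpha) times (topic-word count + beta) / (topic total + V*beta) after decrementing that token's own counts. All function and variable names here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def collapsed_gibbs_lda(docs, K, V, alpha=0.1, beta=0.01, n_iter=50, seed=0):
    """Single-site collapsed Gibbs sampler for LDA (illustrative sketch).

    docs : list of lists of word ids in [0, V)
    K    : number of topics
    V    : vocabulary size
    Returns per-token topic assignments and the doc-topic / topic-word counts.
    """
    rng = np.random.default_rng(seed)
    D = len(docs)
    ndk = np.zeros((D, K), dtype=int)  # doc-topic counts
    nkw = np.zeros((K, V), dtype=int)  # topic-word counts
    nk = np.zeros(K, dtype=int)        # tokens per topic
    z = [np.zeros(len(doc), dtype=int) for doc in docs]

    # Random initialization of topic assignments.
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = rng.integers(K)
            z[d][i] = k
            ndk[d, k] += 1
            nkw[k, w] += 1
            nk[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove this token's current assignment from the counts.
                k = z[d][i]
                ndk[d, k] -= 1
                nkw[k, w] -= 1
                nk[k] -= 1
                # Full conditional over all K topics (an O(K) step per token).
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                k = rng.choice(K, p=p / p.sum())
                # Record the new assignment and restore the counts.
                z[d][i] = k
                ndk[d, k] += 1
                nkw[k, w] += 1
                nk[k] += 1
    return z, ndk, nkw
```

Note the O(K) cost of normalizing the full conditional at every token; it is this per-token cost, repeated over all tokens in a large corpus, that makes sampling efficiency so important for large K, and that the paper's blockwise O(log K)-step nested simulation aims to reduce.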

Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program