Abstract Details

Activity Number: 332 - SPEED: Section on Bayesian Statistical Science
Type: Contributed
Date/Time: Tuesday, August 1, 2017 : 10:30 AM to 12:20 PM
Sponsor: Section on Bayesian Statistical Science
Abstract #323236
Title: Hyperparameter Selection for the Latent Dirichlet Allocation Model
Author(s): Wei Xia* and Hani Doss
Companies: University of Florida and University of Florida
Keywords: Empirical Bayes inference; latent Dirichlet allocation; Markov chain Monte Carlo; model selection; topic modelling
Abstract:

Latent Dirichlet allocation (LDA) is a widely used Bayesian hierarchical model in machine learning for high-dimensional sparse count data such as text documents. As a Bayesian model, it places a prior on a set of latent variables. The prior is indexed by hyperparameters, which strongly influence inference from the model. The ideal estimate of the hyperparameters is the empirical Bayes estimate, which is, by definition, the maximizer of the marginal likelihood of the data with all latent variables integrated out. This estimate cannot be obtained analytically. In practice, the hyperparameters are chosen either in an ad hoc manner or through variants of the EM algorithm whose theoretical basis is weak. We propose an MCMC-based fully Bayesian method for obtaining the empirical Bayes estimate of the hyperparameters. We compare our method with existing approaches on both synthetic and real data. In these comparative experiments, the LDA model with hyperparameters specified by our method outperforms models whose hyperparameters are estimated by other methods.
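
To make the definition of the empirical Bayes estimate concrete, the following is a minimal Python sketch on a toy model, not the authors' method. It assumes a much simpler setting than LDA: each document's word counts come from a single multinomial whose probabilities have a symmetric Dirichlet(alpha) prior, so the marginal likelihood has a closed form and can be maximized over a grid. In LDA the corresponding integral is intractable, which is the difficulty the abstract addresses with MCMC. All names (`log_marginal`, `empirical_bayes_alpha`, `simulate`) and all settings (vocabulary size, grid, seed) are hypothetical choices made for illustration.

```python
import math
import random


def log_marginal(counts, alpha):
    """Log marginal likelihood of the count vectors under a symmetric
    Dirichlet(alpha) prior, with the multinomial probabilities integrated
    out. Terms that do not depend on alpha are dropped."""
    total = 0.0
    for n in counts:
        vocab_size, doc_len = len(n), sum(n)
        total += math.lgamma(vocab_size * alpha)
        total -= math.lgamma(doc_len + vocab_size * alpha)
        total += sum(math.lgamma(c + alpha) - math.lgamma(alpha) for c in n)
    return total


def empirical_bayes_alpha(counts, grid):
    """Return the grid value of alpha that maximizes the marginal likelihood."""
    return max(grid, key=lambda a: log_marginal(counts, a))


def simulate(num_docs, vocab_size, doc_len, alpha, rng):
    """Draw synthetic count vectors from the toy Dirichlet-multinomial model."""
    docs = []
    for _ in range(num_docs):
        # A Dirichlet draw is a vector of normalized Gamma(alpha, 1) variates.
        g = [rng.gammavariate(alpha, 1.0) for _ in range(vocab_size)]
        s = sum(g)
        theta = [x / s for x in g]
        words = rng.choices(range(vocab_size), weights=theta, k=doc_len)
        n = [0] * vocab_size
        for w in words:
            n[w] += 1
        docs.append(n)
    return docs


rng = random.Random(0)
counts = simulate(num_docs=200, vocab_size=10, doc_len=50, alpha=0.5, rng=rng)
grid = [0.05 * k for k in range(1, 61)]  # candidate values 0.05, 0.10, ..., 3.00
alpha_hat = empirical_bayes_alpha(counts, grid)
print("empirical Bayes estimate of alpha on the grid:", alpha_hat)
```

A grid search is used here only because the toy marginal likelihood is cheap to evaluate exactly; it says nothing about how the proposed MCMC-based method estimates the maximizer for LDA.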


Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program

 
 
Copyright © American Statistical Association