Abstract Details

Activity Number: 526 - Bayesian Clustering and Variable Selection
Type: Contributed
Date/Time: Wednesday, August 1, 2018 : 10:30 AM to 12:20 PM
Sponsor: Section on Bayesian Statistical Science
Abstract #328992
Title: Learning the Number of Components and Data Clusters in Bayesian Finite Mixture Models
Author(s): Bettina Grün* and Gertraud Malsiner-Walli and Sylvia Frühwirth-Schnatter
Companies: Johannes Kepler Universität and Wirtschaftsuniversität Wien and Wirtschaftsuniversität Wien
Keywords: Model-based clustering; Finite mixture; Number of components; MCMC; Sparse prior; Data cluster

Bayesian cluster analysis aims to infer the number of data clusters in a data set using either finite or infinite mixture models. Bayesian finite mixture models usually assume a one-to-one relationship between components and data clusters. The number of components can then be determined by comparing the marginal likelihoods of the candidate models or by approximating the posterior of the number of components with methods such as reversible jump MCMC, Markov birth-and-death process sampling, or the Jain-Neal split-merge sampler.
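
As a brief illustration of the marginal likelihood route described above, the posterior of the number of components can be written as follows. The symbols are assumptions introduced here and do not come from the abstract: y denotes the data, K the number of components, and \theta_K the component parameters and weights of the K-component model.

```latex
p(K \mid y) \;\propto\; p(y \mid K)\, p(K),
\qquad
p(y \mid K) \;=\; \int p(y \mid \theta_K, K)\, p(\theta_K \mid K)\, \mathrm{d}\theta_K .
```

Comparing p(y \mid K) across candidate values of K corresponds to the first approach; the sampling-based methods instead approximate p(K \mid y) directly.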

We propose to distinguish explicitly between the number of data clusters and the number of components, and to deliberately allow for more components than data clusters. We extend the standard approach by including priors on the number of components and on the Dirichlet parameter. This allows us to approximate the posteriors of both the number of components and the number of data clusters using Gibbs sampling techniques. The performance of the proposed sampling technique is compared with that of previously proposed approaches. We highlight the additional flexibility gained by suitably selecting the parameters of the hyperpriors and provide guidance for their choice.
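
The distinction between components and data clusters can be illustrated with a small prior simulation. This is a minimal sketch and not the authors' sampler: it assumes a symmetric Dirichlet prior on the component weights and reads "data clusters" as the components that receive at least one observation. The function name `simulate_data_clusters` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_data_clusters(n_obs, n_components, dirichlet_param,
                           n_draws=2000, seed=0):
    """Draw the number of non-empty components under the prior.

    Assumption (not from the abstract): weights follow a symmetric
    Dirichlet(dirichlet_param, ..., dirichlet_param) prior, and a
    "data cluster" is a component with at least one observation.
    """
    rng = np.random.default_rng(seed)
    counts = np.empty(n_draws, dtype=int)
    for i in range(n_draws):
        # Component weights from the symmetric Dirichlet prior.
        weights = rng.dirichlet(np.full(n_components, dirichlet_param))
        # Number of observations allocated to each component.
        sizes = rng.multinomial(n_obs, weights)
        # Data clusters = components that are actually filled.
        counts[i] = np.count_nonzero(sizes)
    return counts


if __name__ == "__main__":
    for e0 in (0.05, 4.0):
        k_plus = simulate_data_clusters(n_obs=100, n_components=10,
                                        dirichlet_param=e0)
        print(f"Dirichlet parameter {e0}: "
              f"mean number of data clusters = {k_plus.mean():.2f}")
```

In this sketch a small Dirichlet parameter tends to leave many of the 10 components empty, so the number of data clusters falls below the number of components, while a large parameter tends to fill all of them. This is one way to see why a prior on the Dirichlet parameter affects the implied number of data clusters.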

Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program