Abstract:
The Dirichlet Process Mixture Model (DPMM), a Bayesian method for unsupervised clustering, is currently the most widely used semiparametric model-based approach for cases in which the number of clusters is unknown. Assigning a Dirichlet process prior in the DPMM remedies some of the drawbacks of using fully parametric priors in Bayesian clustering analyses. This prior is parameterized by a strength parameter and a base measure, both of which influence inferences about the clustering results. Of primary importance is the strength parameter, which controls the expected number of clusters. Estimating the strength parameter is therefore critical, yet the approach for estimating it is often specified in a rather cavalier manner. We propose a new method for determining the strength parameter that is based not only on the number of clusters but also on the shape and configuration of those clusters. We first illustrate how the Total Variation Distance between clusters can be used to calibrate the strength parameter. We then propose our method for updating the strength parameter to define the cluster structure, which is easy to implement in a Gibbs sampler.
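To make concrete how the strength parameter controls the expected number of clusters, the following minimal sketch computes the standard prior expectation under a Dirichlet process: for n observations and strength parameter alpha, the expected number of clusters is the sum over i = 1, ..., n of alpha / (alpha + i - 1). The symbol alpha and the function name expected_num_clusters are assumptions introduced here for illustration. This is the textbook prior result, not the estimation method proposed in this abstract.

```python
def expected_num_clusters(alpha: float, n: int) -> float:
    """Prior expected number of clusters among n observations
    under a Dirichlet process with strength parameter alpha.

    Hypothetical helper for illustration; uses the standard identity
    E[K_n] = sum_{i=1}^{n} alpha / (alpha + i - 1).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if n < 1:
        raise ValueError("n must be at least 1")
    # range(n) yields i - 1 for i = 1, ..., n
    return sum(alpha / (alpha + i) for i in range(n))


if __name__ == "__main__":
    # Larger alpha gives more expected clusters for the same n.
    for alpha in (0.1, 1.0, 10.0):
        print(alpha, expected_num_clusters(alpha, 100))
```

With alpha = 1 and n = 100 the expectation is the 100th harmonic number, roughly 5.2, and it grows only logarithmically in n, which is why the choice of alpha has such a strong effect on the inferred cluster count.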
ASA Meetings Department
Copyright © American Statistical Association.