Activity Number: 262
Type: Topic Contributed
Date/Time: Tuesday, August 4, 2009 : 8:30 AM to 10:20 AM
Sponsor: Section on Bayesian Statistical Science

Abstract - #304162

Title: Distance-Based Probability Distribution on Set Partitions with Applications to Bayesian Nonparametrics
Author(s): David B. Dahl*+ and Ryan Day and Jerry Tsai
Companies: Texas A&M University and University of the Pacific and University of the Pacific
Address: 3143 TAMU, College Station, TX, 77843-3143
Keywords: Bayesian nonparametrics ; clustering methods ; Dirichlet process mixture model ; non-exchangeable priors ; partition models

Abstract:
Methods that integrate several types of data into a single model are increasingly necessary. Clustering methods are typically either model-based or distance-based, but we propose a method that is both. The Dirichlet process induces a clustering distribution in which the probability that an item is clustered with another is uniform across all items. We provide an extension that incorporates distance information to yield a distribution for partitions that is indexed by pairwise distances. We define a new class of Bayesian nonparametric models that uses this distance-based probability distribution over partitions as a prior clustering distribution. We apply this method to a model for protein structure prediction and find that it substantially improves predictive accuracy.
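
To make the contrast concrete, the sketch below compares the sequential allocation rule implied by the Dirichlet process (the Chinese restaurant process, in which every previously allocated item carries equal weight) with one hypothetical way of letting pairwise distances enter. The abstract does not give the authors' formula, so the exponential similarity `exp(-decay * d)`, the rescaling of weights to sum to the number of earlier items, and the names `allocation_probs`, `sample_partition`, `alpha`, and `decay` are all assumptions made for illustration only.

```python
import math
import random


def allocation_probs(i, labels, alpha, dist=None, decay=1.0):
    """Probabilities that item i joins each existing cluster; the last entry is a new cluster.

    labels[j] is the cluster index of earlier item j (j < i).
    dist=None: every earlier item has weight 1 (Chinese restaurant process).
    dist given: earlier item j has weight exp(-decay * dist[i][j]), rescaled so the
    weights sum to i. This weighting is an illustrative assumption, not the
    authors' definition.
    """
    n_clusters = max(labels[:i], default=-1) + 1
    if dist is None:
        weights = [1.0] * i
    else:
        raw = [math.exp(-decay * dist[i][j]) for j in range(i)]
        total = sum(raw)
        weights = [i * w / total for w in raw]
    # Each cluster's mass is the sum of the weights of its current members.
    mass = [0.0] * n_clusters
    for j in range(i):
        mass[labels[j]] += weights[j]
    denom = alpha + i
    return [m / denom for m in mass] + [alpha / denom]


def sample_partition(n, alpha, dist=None, decay=1.0, rng=None):
    """Draw cluster labels for n items by allocating them one at a time."""
    rng = rng or random.Random()
    labels = []
    for i in range(n):
        probs = allocation_probs(i, labels, alpha, dist, decay)
        labels.append(rng.choices(range(len(probs)), weights=probs)[0])
    return labels
```

Calling `sample_partition(n, alpha)` without a distance matrix draws from the uniform-weight rule, while passing a symmetric `dist` matrix makes items that are close to item `i` pull it toward their cluster more strongly. In this sketch the probability of opening a new cluster is left unchanged at `alpha / (alpha + i)`, a design choice that keeps the role of `alpha` the same in both cases.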