Abstract:
|
We develop a clustering algorithm that does not require the number of clusters to be known in advance. Furthermore, the method is coordinatewise invariant to rotation, scaling and translation. We call it the "Affine-invariant Bayesian (AIB) process". We propose a highly efficient split-merge Gibbs sampling algorithm for inference. Using the Ewens sampling distribution as the prior on the partition and the profile residue likelihoods of the responses under three different covariance matrix structures, we obtain inferences in the form of a posterior distribution on the partition. The Gibbs sampler has two stages: 1) given a partition, the covariance ratio parameter is generated from its posterior distribution and a Gibbs sampler determines acceptance; 2) a partition matrix is generated by the split-merge algorithm and accepted or rejected by the Gibbs sampler, given the covariance parameter from Stage 1. The resulting Markov chain on partitions is irreducible and aperiodic. Our experiments show that the estimated posterior probabilities of the partition can reflect the true blocks when displayed as a heat map or as a distance-based tree built from the partition estimate.
|
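As an illustration of the partition prior named in the abstract, the following minimal sketch evaluates the log-probability of a set partition under the Ewens sampling distribution in its standard form, P(B) = theta^k * Gamma(theta) / Gamma(n + theta) * prod_b Gamma(n_b), where k is the number of blocks and n_b the size of block b. The function name `ewens_log_prob`, the label-vector encoding of a partition, and the default `theta = 1.0` are assumptions made for this example; they are not taken from the paper.

```python
import math
from collections import Counter


def ewens_log_prob(labels, theta=1.0):
    """Log-probability of a set partition under the Ewens sampling distribution.

    labels: sequence of block labels, one per observation (hypothetical encoding).
    theta:  positive concentration parameter (assumed value, not from the paper).
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    n = len(labels)
    block_sizes = list(Counter(labels).values())
    k = len(block_sizes)
    # theta^k * Gamma(theta) / Gamma(n + theta), computed on the log scale
    log_p = k * math.log(theta) + math.lgamma(theta) - math.lgamma(n + theta)
    # product over blocks of Gamma(n_b) = (n_b - 1)!
    log_p += sum(math.lgamma(size) for size in block_sizes)
    return log_p


if __name__ == "__main__":
    # The five set partitions of three items, encoded as label vectors.
    partitions = [[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 2]]
    total = sum(math.exp(ewens_log_prob(p, theta=1.0)) for p in partitions)
    print(f"total prior mass over all partitions of 3 items: {total:.6f}")
```

In a sampler of the kind described, a quantity like this would be added to the log-likelihood of the responses to score a proposed split or merge; how the paper combines the two is not specified in the abstract, so this sketch covers the prior term only.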