Abstract:
|
We develop the Wedding Tables Process (WTP), an exchangeable distribution on partitions, and the Exchangeable Bayesian Matrix Hierarchical Clustering (EBMHC) method for matrix-variate data. EBMHC samples clusters efficiently from the full conditional distribution on partitions, and its sampled parameters converge almost surely. The exchangeability of WTP facilitates parallel processing. We derive a BIC-like measure, aBIC, to determine the optimal number of clusters K*. EBMHC produces a posterior similarity matrix in which each element is the posterior probability that two matrices belong to the same cluster. We propose a clustering tree algorithm that sorts the subject indices; the resulting ordered index realigns the indices so that the K* clusters are revealed by the block-diagonal structure, and it also yields a dendrogram. Ties are resolved using prior information or a multi-way split in the tree. We propose a one-component graph with minimum entropy to visualize the clusters’ topology, and we suggest non-informative or data-based hyperparameter choices. EBMHC is applied successfully to simulated and real-world data.
|
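As a rough illustration of the posterior similarity matrix described in the abstract, the sketch below estimates it from a set of sampled partitions by averaging co-clustering indicators over the draws. It is not the EBMHC sampler. The function name `posterior_similarity` and the toy label draws are assumptions made up for this example, and the labels stand in for whatever partition samples a sampler would return.

```python
import numpy as np

def posterior_similarity(label_samples):
    """Estimate a posterior similarity matrix from sampled partitions.

    label_samples: array of shape (S, n); row s holds the cluster label of
    each of the n items in the s-th sampled partition (hypothetical input).
    """
    labels = np.asarray(label_samples)
    # same[s, i, j] is True when items i and j share a cluster in sample s
    same = labels[:, :, None] == labels[:, None, :]
    # Averaging over samples gives the estimated co-clustering probability
    return same.mean(axis=0)

# Hypothetical draws: 4 sampled partitions of 5 items
samples = np.array([
    [0, 0, 1, 1, 2],
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 2],
    [0, 1, 2, 2, 2],
])
psm = posterior_similarity(samples)
```

Because only equality of labels within a sample is used, the estimate is unaffected by label switching across samples: the third draw relabels the clusters of the first without changing which items are grouped together.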