Abstract:
|
To identify clusters of objects whose features have a general covariance structure, we develop a model-based cluster process that uses Gaussian models with general covariance structures on the feature space and clusters data without knowing the number of clusters in advance. The proposed method can identify clusters under three models: model I, in which the features are independent and have the same variance; model II, in which the features are independent but have different variances; and model III, in which the features are dependent and have different variances. The proposed split-merge algorithm yields an irreducible and aperiodic Markov chain and identifies clusters reasonably well and efficiently in various applications. We illustrate the approach on both synthetic and real data: leukemia gene expression data for model I; wine data and the two half-moons benchmark data for model II; and three-dimensional Denmark road network data and two half-moons data under an arbitrary non-singular transformation for model III.
|