552 – Contributed Oral Poster Presentations: Section on Statistical Learning and Data Mining
Choosing the Number of Clusters in Monothetic Clustering
Tan V. Tran
Montana State University
Mark C. Greenwood
Montana State University
Monothetic clustering is a divisive clustering method based on recursive bipartitions of the data set, with each split determined by a rule on a single variable, chosen from among all the variables, that conditionally optimally partitions the multivariate responses. As in other clustering methods, the choice of the number of clusters is important. Connections between monothetic clustering and decision trees motivate the consideration of pruning methods as aids in selecting the number of clusters. We apply different cross-validation techniques to find the number of clusters that minimizes prediction error and compare that approach to permutation-based hypothesis tests at each bi-splitting step, retaining splits with "small" p-values. A simulation study is performed to evaluate the performance of the new methods and to compare them with some existing techniques.
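
To make the bipartition step concrete, the following is a minimal sketch of a single monothetic split. It is not the authors' implementation. It assumes that the quantity being optimized is the within-cluster inertia (the sum of squared Euclidean distances to the cluster mean), a criterion the abstract does not name, and the function names `inertia` and `best_monothetic_split` are hypothetical.

```python
import numpy as np


def inertia(X):
    """Sum of squared Euclidean distances from each row to the cluster mean.

    Assumed splitting criterion; the abstract does not specify one.
    """
    if len(X) == 0:
        return 0.0
    return float(((X - X.mean(axis=0)) ** 2).sum())


def best_monothetic_split(X):
    """Search every variable and every candidate cut for the best bipartition.

    A candidate rule has the form X[:, j] < cut, where cut is the midpoint
    between two consecutive distinct observed values of variable j.
    Returns (variable index, cut value, total within-cluster inertia),
    or None if no variable has two distinct values.
    """
    best = None
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])  # sorted distinct values of variable j
        for cut in (values[:-1] + values[1:]) / 2.0:
            left = X[:, j] < cut
            total = inertia(X[left]) + inertia(X[~left])
            if best is None or total < best[2]:
                best = (j, float(cut), total)
    return best
```

Applying this search recursively to each resulting cluster grows a binary tree, and a tree with k - 1 retained splits defines k clusters. Deciding how many of those splits to retain is the problem that the cross-validation and permutation-test approaches address.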