JSM 2017 Online Program

Activity Number:	529 - SPEED: Machine Learning
Type:	Contributed
Date/Time:	Wednesday, August 2, 2017 : 10:30 AM to 11:15 AM
Sponsor:	Section on Statistical Learning and Data Science
Abstract #325311
Title:	Statistical Significance of Clustering
Author(s):	Purvasha Chakravarti* and Larry Wasserman and Sivaraman Balakrishnan
Companies:	Carnegie Mellon University and Carnegie Mellon and Department of Statistics, CMU
Keywords:	Clustering ; k-means ; Power ; Hypothesis Testing
Abstract:	In clustering, it is critical to separate real clusters from noise. Liu et al. (2012) proposed an approach based on iterative hypothesis testing. We study the theoretical properties of their procedure. In particular, we derive the limiting distribution of the test statistic, which allows us to characterize the power of the test. We also consider some simulated examples.

Authors who are presenting talks have a * after their name.