Abstract:
|
Liu, Hayes, Nobel and Marron (2008) introduced a hypothesis testing method (SigClust) for deciding whether a data set is adequately described by a single Gaussian or should be split into two clusters. Applied recursively, this test defines a hierarchical clustering method that comes equipped with a significance guarantee. In this paper, we improve on this procedure. First, we study the power of SigClust and show that it can be very low. We then define a new test, RIFT (Relative Information Fit Test), that has higher power. Finally, we show that the resulting method provably finds the correct clustering structure under certain conditions.
|