JSM 2013 Home

Abstract Details

Activity Number: 318
Type: Contributed
Date/Time: Tuesday, August 6, 2013 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Learning and Data Mining
Abstract - #307639
Title: Unsupervised Learning: Assessing Cluster Significance Through a Combination of Cross-Validation and Resampling
Author(s): Werner Stuetzle*+
Companies: University of Washington
Keywords: Cluster analysis; Nonparametric clustering; Single linkage; Mode estimation; Cross-validation; Resampling
Abstract:

The goal of clustering is to detect the presence of distinct groups in a data set and assign group labels to the observations. Nonparametric clustering is based on the premise that the observations may be regarded as a sample from some underlying density in feature space and that groups correspond to modes of this density. We use Generalized Single Linkage (GSL) clustering (Stuetzle and Nugent, JCGS, Vol. 19, No. 2, 2010, pp. 397–418) as our clustering method. The question then arises whether clusters in the data suggested by GSL indeed correspond to distinct modes of the underlying density or can be attributed to sampling variability. We propose a heuristic based on a combination of cross-validation and resampling to answer this question, and we present the results of Monte Carlo experiments assessing the level and power of our method.
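
The premise that groups correspond to modes of a density, and that apparent modes can arise from sampling variability alone, can be illustrated with a small sketch. It is not GSL clustering and not the heuristic proposed in the talk, neither of which is specified in this abstract. As a stand-in, it assumes one-dimensional data, a Gaussian kernel density estimate with a hand-picked bandwidth, and plain bootstrap resampling; the function names `count_modes` and `bootstrap_mode_frequency` are hypothetical.

```python
import numpy as np


def count_modes(sample, bandwidth, grid_size=512):
    """Count local maxima of a 1-D Gaussian kernel density estimate on a grid.

    Illustrative stand-in only: this is not the GSL method named in the abstract.
    """
    sample = np.asarray(sample, dtype=float)
    grid = np.linspace(sample.min() - 3 * bandwidth,
                       sample.max() + 3 * bandwidth, grid_size)
    # Unnormalised estimate: the number of modes does not depend on the constant.
    z = (grid[:, None] - sample[None, :]) / bandwidth
    density = np.exp(-0.5 * z ** 2).sum(axis=1)
    interior = density[1:-1]
    # Strict on the left, non-strict on the right, so a flat peak counts once.
    is_peak = (interior > density[:-2]) & (interior >= density[2:])
    return int(is_peak.sum())


def bootstrap_mode_frequency(sample, bandwidth, n_boot=200, seed=0):
    """Fraction of bootstrap resamples whose estimate shows at least two modes."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    hits = 0
    for _ in range(n_boot):
        # Resample with replacement, keeping the original sample size.
        resample = rng.choice(sample, size=len(sample), replace=True)
        hits += count_modes(resample, bandwidth) >= 2
    return hits / n_boot
```

For two well-separated groups, nearly every resample should show at least two modes; for a sample from a single-mode density, the same frequency indicates how often a second mode appears through sampling variability at the chosen bandwidth.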


Authors who are presenting talks have a * after their name.



The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

Copyright © American Statistical Association.