Abstract Details
Activity Number: 318
Type: Contributed
Date/Time: Tuesday, August 6, 2013, 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Learning and Data Mining
Abstract - #307639
Title: Unsupervised Learning: Assessing Cluster Significance Through a Combination of Cross-Validation and Resampling
Author(s): Werner Stuetzle*+
Companies: University of Washington
Keywords: Cluster analysis; Nonparametric clustering; Single linkage; Mode estimation; Cross-validation; Resampling
Abstract:
The goal of clustering is to detect the presence of distinct groups in a data set and assign group labels to the observations. Nonparametric clustering is based on the premise that the observations may be regarded as a sample from some underlying density in feature space and that groups correspond to modes of this density. We use Generalized Single Linkage (GSL) clustering (Stuetzle and Nugent, JCGS Vol 19, No. 2, 2010, pp. 397–418) as our clustering method. The question then arises whether clusters in the data suggested by GSL indeed correspond to distinct modes of the underlying density or can be attributed to sampling variability. We propose a heuristic based on a combination of cross-validation and resampling to answer this question, and we present the results of Monte Carlo experiments assessing the level and power of our method.
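The premise that groups correspond to modes of a density can be illustrated with a minimal sketch. This is not GSL and not the heuristic proposed in the abstract: it assumes a one-dimensional sample, a Gaussian kernel density estimate with a hand-picked bandwidth, and a simple grid search for local maxima. The names `kde` and `count_modes`, the toy samples, and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def kde(sample, x, bandwidth):
    """Gaussian kernel density estimate of `sample`, evaluated at x."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
    return total / norm


def count_modes(sample, bandwidth, lo, hi, steps=160):
    """Count strict local maxima of the density estimate on an even grid."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    dens = [kde(sample, x, bandwidth) for x in grid]
    return sum(
        1 for i in range(1, steps) if dens[i - 1] < dens[i] > dens[i + 1]
    )


# Toy data (assumed, not from the abstract): five points per group.
offsets = (-0.5, -0.25, 0.0, 0.25, 0.5)
two_groups = [c + d for c in (-2.0, 2.0) for d in offsets]
one_group = list(offsets)

modes_two = count_modes(two_groups, bandwidth=0.5, lo=-4.0, hi=4.0)
modes_one = count_modes(one_group, bandwidth=0.5, lo=-4.0, hi=4.0)
# A much smaller bandwidth makes the estimate follow individual points.
modes_rough = count_modes(two_groups, bandwidth=0.05, lo=-4.0, hi=4.0)
```

With the larger bandwidth the two-group sample yields two modes and the one-group sample yields one. With the much smaller bandwidth the same data produce additional modes that reflect individual observations rather than group structure, which is the kind of spurious structure the abstract's significance question is concerned with.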
Authors who are presenting talks have a * after their name.
2013 JSM Online Program Home