The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
Abstract Details
Activity Number: 402
Type: Contributed
Date/Time: Tuesday, July 31, 2012 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Mining
Abstract - #306853
Title: On the Estimation of Similarity Indices in Clustering Evaluation
Author(s): John Ramey*+
Companies: Baylor University
Address: Department of Statistical Science, Waco, TX, 76798-7140, United States
Keywords: unsupervised learning; clustering evaluation; clustering similarity; Rand index; Jaccard index
Abstract:
The evaluation of clustering algorithms has been argued to be as important as the clustering itself, yet evaluation methods have not been well studied and are less straightforward than the evaluation of supervised learning models. In practice, clusters are generally assessed by subjective judgment of visualization tools, which are prone to oversimplifying the geometric structure of the data and can be misleading as well as difficult to interpret. Several proposed evaluation methods use similarity coefficients, such as the Rand and Jaccard indices, to compare the clusters produced by candidate clustering methods, often in combination with resampling techniques. We show that the common approach of estimating these similarity coefficients via contingency tables is naive and yields extremely biased estimators, which can lead to invalid conclusions about the determined clusters. We present a Bayesian approach to estimating the similarity coefficients based on a more reasonable likelihood and demonstrate that this alternative improves the similarity coefficient estimation, thereby improving the clustering assessment.
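For readers unfamiliar with the similarity coefficients named in the abstract, the following is a minimal sketch of the standard pair-counting Rand and Jaccard indices computed from the contingency table of two labelings. It illustrates only the conventional contingency-table calculation that the abstract describes as the common approach; it is not the Bayesian estimator proposed by the author, and the function names (`pair_counts`, `rand_index`, `jaccard_index`) are hypothetical and chosen here for illustration.

```python
from collections import Counter
from math import comb


def pair_counts(labels_a, labels_b):
    """Count pairs of observations by co-membership in two clusterings.

    Returns (n11, n10, n01, n00): pairs placed together in both clusterings,
    together only in A, together only in B, and apart in both.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same observations")

    n = len(labels_a)
    # Contingency table: how many observations share each (A-cluster, B-cluster) cell.
    contingency = Counter(zip(labels_a, labels_b))
    sizes_a = Counter(labels_a)
    sizes_b = Counter(labels_b)

    together_both = sum(comb(c, 2) for c in contingency.values())
    together_a = sum(comb(c, 2) for c in sizes_a.values())
    together_b = sum(comb(c, 2) for c in sizes_b.values())

    n11 = together_both
    n10 = together_a - together_both
    n01 = together_b - together_both
    n00 = comb(n, 2) - n11 - n10 - n01
    return n11, n10, n01, n00


def rand_index(labels_a, labels_b):
    """Proportion of pairs on which the two clusterings agree."""
    n11, n10, n01, n00 = pair_counts(labels_a, labels_b)
    total = n11 + n10 + n01 + n00
    # Assumption: with fewer than two observations there are no pairs to disagree on.
    return (n11 + n00) / total if total else 1.0


def jaccard_index(labels_a, labels_b):
    """Like the Rand index, but ignoring pairs that are apart in both clusterings."""
    n11, n10, n01, _ = pair_counts(labels_a, labels_b)
    denom = n11 + n10 + n01
    # Assumption: if no pair is co-clustered in either labeling, treat them as identical.
    return n11 / denom if denom else 1.0


if __name__ == "__main__":
    a = [0, 0, 1, 1, 2, 2]
    b = [0, 0, 1, 2, 2, 2]
    print(rand_index(a, b), jaccard_index(a, b))
```

Both indices depend only on which observations are grouped together, so relabeling the clusters of either partition leaves the values unchanged.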
The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.