Activity Number: 340
Type: Contributed
Date/Time: Tuesday, July 31, 2007 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Computing

Abstract - #308540

Title: Smoothing the Dissimilarities Among Binary Data for Cluster Analysis
Author(s): David Hitchcock*+ and Zhimin Chen
Companies: University of South Carolina and University of South Carolina
Address: Department of Statistics, Columbia, SC, 29208
Keywords: binary data ; cluster analysis ; smoothing ; shrinkage ; dissimilarity
Abstract:
Cluster analysis attempts to group data objects into homogeneous clusters on the basis of the pairwise dissimilarities among the objects. When the data contain noise, we might consider performing a smoothing operation, either on the data themselves or on the dissimilarities, before implementing the clustering algorithm. Possible benefits of such pre-smoothing are discussed in the context of binary data. We suggest a method for cluster analysis of binary data based on "smoothed" dissimilarities. The smoothing method presented borrows ideas from shrinkage estimation of cell probabilities. Some initial results are given, and some avenues for future work in this area are outlined. The method is illustrated with an example involving binary item response data.
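
As a rough illustration of the general idea only, the sketch below smooths a dissimilarity between two binary vectors by shrinking the four cell proportions of their 2x2 agreement table before computing the dissimilarity. The abstract does not specify the estimator, so everything here is an assumption: the function names, the choice of the Jaccard dissimilarity, the uniform shrinkage target of 1/4, and the pseudo-count parameter `k` are all hypothetical and are not the authors' method.

```python
def cell_counts(x, y):
    """Counts of the (1,1), (1,0), (0,1), (0,0) patterns for two binary vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have equal length")
    a = sum(1 for u, v in zip(x, y) if u == 1 and v == 1)
    b = sum(1 for u, v in zip(x, y) if u == 1 and v == 0)
    c = sum(1 for u, v in zip(x, y) if u == 0 and v == 1)
    d = len(x) - a - b - c
    return a, b, c, d


def shrunken_cell_probs(counts, k=1.0):
    """Shrink observed cell proportions toward the uniform value 1/4.

    k is a hypothetical total pseudo-count (assumption, not from the abstract);
    k = 0 returns the raw proportions.
    """
    n = sum(counts)
    return tuple((cnt + k / 4) / (n + k) for cnt in counts)


def smoothed_jaccard(x, y, k=1.0):
    """Jaccard dissimilarity computed from the shrunken cell probabilities.

    With k = 0 this is the ordinary Jaccard dissimilarity, which is undefined
    when neither vector contains a 1.
    """
    p11, p10, p01, _ = shrunken_cell_probs(cell_counts(x, y), k)
    return (p10 + p01) / (p11 + p10 + p01)


def smoothed_dissimilarity_matrix(data, k=1.0):
    """Symmetric matrix (list of lists) of smoothed pairwise dissimilarities."""
    n = len(data)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = smoothed_jaccard(data[i], data[j], k)
    return matrix


if __name__ == "__main__":
    # Made-up binary item responses, one row per respondent.
    responses = [
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 1, 1],
    ]
    for row in smoothed_dissimilarity_matrix(responses, k=1.0):
        print([round(value, 3) for value in row])
```

The resulting matrix could then be passed to any clustering routine that accepts precomputed dissimilarities. A larger `k` pulls every pair's cell proportions more strongly toward the uniform target, and `k = 0` recovers the unsmoothed dissimilarity.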