JSM 2015 Preliminary Program

Abstract Details

Activity Number: 614
Type: Contributed
Date/Time: Wednesday, August 12, 2015 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Mining
Abstract #315647
Title: Clustering of High-Dimensional Categorical Data
Author(s): Saeid Amiri* and Bertrand Clarke and Jennifer Clarke
Companies: University of Nebraska - Lincoln and University of Nebraska - Lincoln and University of Nebraska - Lincoln
Keywords: Clustering; categorical data; ensemble methods; high-dimensional data
Abstract:

We propose an ensemble approach to clustering categorical data. The method is based on hierarchical clustering under average linkage. We give a rationale for why the procedure performs well in low dimensions, and we support it with extensive computational comparisons against other methods on simulated and real data. The method extends from low-dimensional to high-dimensional categorical data by adding an extra level of ensembling. This minimizes the effect of the curse of dimensionality, which tends to equalize the distances between any two points as the dimension increases. A further extension of the ensembling method permits the vectors of categorical outcomes to have different dimensions.
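
The abstract does not spell out the procedure, so the following Python sketch is only a generic illustration of the ideas it names, not the authors' algorithm. It assumes Hamming distance between categorical vectors, builds ensemble members by running average-linkage clustering on random feature subsets, and combines them through a co-association matrix. The function names, the subset size, and the toy data are all hypothetical.

```python
import random
from itertools import combinations

def hamming(a, b):
    """Fraction of positions where two categorical vectors differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def average_linkage(dist, k):
    """Naive agglomerative clustering under average linkage.

    dist is a full symmetric distance matrix; merging stops at k clusters.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for p, q in combinations(range(len(clusters)), 2):
            # Average linkage: mean distance over all cross-cluster pairs.
            total = sum(dist[i][j] for i in clusters[p] for j in clusters[q])
            d = total / (len(clusters[p]) * len(clusters[q]))
            if best is None or d < best[0]:
                best = (d, p, q)
        _, p, q = best
        clusters[p] += clusters[q]
        del clusters[q]
    labels = [0] * len(dist)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def ensemble_cluster(data, k, n_members=20, subset_size=5, seed=0):
    """Hypothetical ensemble: cluster random feature subsets, then cluster the consensus."""
    rng = random.Random(seed)
    n, p = len(data), len(data[0])
    together = [[0] * n for _ in range(n)]
    for _ in range(n_members):
        cols = rng.sample(range(p), subset_size)
        sub = [[row[c] for c in cols] for row in data]
        dist = [[hamming(a, b) for b in sub] for a in sub]
        labels = average_linkage(dist, k)
        for i in range(n):
            for j in range(n):
                together[i][j] += labels[i] == labels[j]
    # Consensus dissimilarity: share of members that separated the pair.
    consensus = [[1 - together[i][j] / n_members for j in range(n)]
                 for i in range(n)]
    return average_linkage(consensus, k)

# Toy data: two groups over 8 categorical features, one noisy entry per row.
data = []
for g, base in enumerate("ab"):
    for r in range(5):
        row = [base] * 8
        row[(r + g) % 8] = "c"  # a single corrupted feature
        data.append(row)

labels = ensemble_cluster(data, k=2)
print(labels)
```

Clustering the consensus matrix rather than the raw distances is one common way to aggregate ensemble members; whether the authors use this or another aggregation rule is not stated in the abstract.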


Authors who are presenting talks have a * after their name.


The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
