JSM 2015 Online Program

Abstract Details

Activity Number: 190
Type: Contributed
Date/Time: Monday, August 10, 2015 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Learning and Data Mining
Abstract #315319
Title: A Semiparametric Method for Clustering Mixed Data
Author(s): Alexander Foss* and Marianthi Markatou and Aliza Heching and Bonnie K. Ray
Companies: SUNY Buffalo and SUNY Buffalo and IBM T.J. Watson Research Center and IBM T.J. Watson Research Center
Keywords: Cluster analysis ; Unsupervised Learning ; Mixed Data ; Measurement Error

Despite the large number of existing clustering algorithms, clustering remains a challenging problem. As large datasets become increasingly common across domains, clustering algorithms must often be applied to heterogeneous sets of variables, creating an acute need for robust and scalable clustering methods for mixed continuous and categorical data. We show that current clustering methods for mixed data suffer from at least one of two central challenges: (1) they are unable to equitably balance the contribution of continuous and categorical variables without strong parametric assumptions; or (2) they are unable to properly handle datasets in which only a subset of variables is related to the underlying cluster structure of interest. We develop KAMILA (KAy-means for MIxed LArge data), a clustering method that addresses these challenges without requiring strong parametric assumptions. We demonstrate the superiority of our method in a series of Monte Carlo simulation studies and a real-world application.
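
Challenge (1) can be illustrated with a small sketch. It is not the KAMILA algorithm and is not taken from the authors' work; it only shows, under assumed conventions, why distance-based clustering on dummy-coded mixed data depends on an arbitrary weighting choice. The helper names `one_hot` and `mixed_sq_distance`, the `cat_weight` parameter, and the toy points are all hypothetical. The continuous variable is assumed to be standardized already, so a difference of 1.0 means one standard deviation.

```python
def one_hot(level, levels):
    """Dummy-code a single categorical level as a 0/1 vector."""
    return [1.0 if level == candidate else 0.0 for candidate in levels]


def mixed_sq_distance(a, b, levels, cat_weight=1.0):
    """Split the squared Euclidean distance between two mixed observations.

    Each observation is a (continuous_value, category) pair. The categorical
    variable is dummy coded and multiplied by cat_weight (a hypothetical
    tuning knob). Returns (continuous_part, categorical_part).
    """
    cont_part = (a[0] - b[0]) ** 2
    coded_a = one_hot(a[1], levels)
    coded_b = one_hot(b[1], levels)
    cat_part = sum((cat_weight * (u - v)) ** 2 for u, v in zip(coded_a, coded_b))
    return cont_part, cat_part


levels = ["a", "b", "c"]
p = (0.0, "a")  # toy observation
q = (1.0, "b")  # one standard deviation away, different category

for w in (0.5, 1.0, 2.0):
    cont, cat = mixed_sq_distance(p, q, levels, cat_weight=w)
    print(f"cat_weight={w}: continuous part={cont}, categorical part={cat}")
```

In this toy setup a category mismatch contributes 2 * cat_weight**2 to the squared distance, while the continuous contribution stays fixed, so which variable type dominates is decided entirely by the chosen weight. That is the kind of balancing decision the abstract says existing methods cannot make equitably without strong parametric assumptions.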

Authors who are presenting talks have a * after their name.

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.