Abstract Details

Activity Number: 669 - Recent Advances in Nonparametric Statistics
Type: Contributed
Date/Time: Thursday, August 3, 2017 : 10:30 AM to 12:20 PM
Sponsor: Section on Nonparametric Statistics
Abstract #323575
Title: A Latent Trait Clustering Method for Mixed-Mode Data
Author(s): Yawei Liang* and David Hitchcock
Companies: University of South Carolina and University of South Carolina
Keywords: clustering; mixed-mode data; latent variable

Difficulties in clustering mixed-mode data include handling the association between variables of different types and deciding how the continuous and categorical variables should be weighted in the algorithm. We handle such data with a multivariate normal model, assuming that each categorical variable arises from a latent continuous variable whose thresholds define the categories. We propose a new method for generating realizations of the latent variables that correspond to the observed categorical variables, and we then apply k-means clustering to the observed and generated continuous data together. This new method, called the latent realization clustering method, depends on the Kendall rank correlation coefficient between variables of different types. When applied to simulated data, the method is less accurate than the mixture-model-based clustering method but takes much less time. Additionally, we use the variation across our latent data realizations to produce estimated probabilities of each observation belonging to each cluster, as well as a probability matrix that estimates, for each pair of observations, the probability that the two fall in the same cluster.
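
The general idea of threshold-based latent realizations followed by k-means can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not give the algorithm, and the sketch omits the Kendall rank correlation step entirely, drawing each latent value independently of the continuous variables. The function names (draw_latent, kmeans, coclustering_matrix), the use of empirical category proportions as thresholds, the column standardization, and the toy data are all assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for its inverse CDF

def draw_latent(categories, rng):
    """Draw one latent N(0, 1) value per observation, truncated to the
    threshold interval implied by its observed ordinal category.
    Assumption: thresholds come from empirical cumulative proportions."""
    categories = np.asarray(categories)
    levels = np.sort(np.unique(categories))
    counts = np.array([(categories == lv).sum() for lv in levels])
    # Cut points on the probability scale: 0 = c_0 < c_1 < ... < c_m = 1.
    cum = np.concatenate(([0.0], np.cumsum(counts) / counts.sum()))
    idx = np.searchsorted(levels, categories)
    # Inverse-CDF sampling of a truncated normal.
    u = rng.uniform(cum[idx], cum[idx + 1])
    u = np.clip(u, 1e-12, 1 - 1e-12)
    return np.array([STD_NORMAL.inv_cdf(p) for p in u])

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's algorithm with random initial centers."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def coclustering_matrix(X_cont, X_cat, k, n_real=20, seed=0):
    """Repeat the latent draw n_real times, cluster each completed data
    set, and return the share of runs in which each pair co-clusters."""
    rng = np.random.default_rng(seed)
    n = X_cont.shape[0]
    together = np.zeros((n, n))
    for _ in range(n_real):
        latent = np.column_stack(
            [draw_latent(X_cat[:, j], rng) for j in range(X_cat.shape[1])]
        )
        Z = np.column_stack([X_cont, latent])
        # Assumption: standardize columns so no variable dominates.
        Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
        labels = kmeans(Z, k, rng)
        together += labels[:, None] == labels[None, :]
    return together / n_real

# Toy data (hypothetical): two groups, two continuous columns, one
# ordinal column with categories 0, 1, 2.
rng = np.random.default_rng(1)
group = np.repeat([0, 1], 30)
X_cont = rng.normal(loc=group[:, None] * 4.0, scale=1.0, size=(60, 2))
X_cat = (group + rng.integers(0, 2, size=60)).reshape(-1, 1)

P = coclustering_matrix(X_cont, X_cat, k=2)
```

Here P[i, j] is the fraction of latent realizations in which observations i and j were assigned to the same cluster, which is invariant to how cluster labels are numbered in each run. Estimating per-observation cluster membership probabilities would additionally require aligning labels across runs, which this sketch does not attempt.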

Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program

Copyright © American Statistical Association