2015 Joint Statistical Meetings - Statistics: Making Better Decisions.

Abstract Details

Activity Number:	660
Type:	Contributed
Date/Time:	Thursday, August 13, 2015 : 8:30 AM to 10:20 AM
Sponsor:	Section on Statistical Computing
Abstract #317638
Title:	K-Means Clustering with Missing Data
Author(s):	Juwon Song*
Companies:	Korea University
Keywords:	k-means clustering ; missing data ; imputation
Abstract:	K-means clustering is one of the most popular techniques for dividing observations into groups, and it is simple to implement for completely observed data. When data include missing values, a common approach is to impute them in a preprocessing stage before clustering, but such imputation ignores cluster membership information. To overcome this disadvantage, it has been suggested to impute missing values using the observed mean or nearest neighbors when calculating distances between observed values and group means. Alternatively, K-means clustering can be considered a special case of clustering based on finite mixture models. We consider an extension of the finite mixture model that uses the EM algorithm to handle missing data in K-means clustering. A simulation is conducted to evaluate the performance of the suggested method and to compare it with other commonly used approaches to handling missing data in K-means clustering.
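
To illustrate the contrast drawn in the abstract between one-off preprocessing imputation and imputation that uses membership information, the following is a minimal NumPy sketch. It is not the author's EM-based mixture-model method. It is a simple alternating scheme, assumed here only for illustration: missing entries start at the column mean and are then re-imputed from the center of the currently assigned cluster at each iteration. The function name `kmeans_missing` and the parameters `init_centers` and `n_iter` are hypothetical.

```python
import numpy as np

def kmeans_missing(X, init_centers, n_iter=100):
    """Illustrative k-means for data with NaNs (not the abstract's method).

    Missing entries are first filled with column means, then re-filled
    with the center of the assigned cluster after every update step.
    Assumes every column has at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    centers = np.array(init_centers, dtype=float)  # copy, shape (k, d)
    missing = np.isnan(X)

    # Initial fill: column means, which ignore cluster membership.
    X_filled = np.where(missing, np.nanmean(X, axis=0), X)
    labels = np.full(len(X), -1)

    for _ in range(n_iter):
        # Assignment step: squared Euclidean distance to each center.
        d2 = ((X_filled[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)

        # Update step: recompute centers; an empty cluster keeps its center.
        for j in range(len(centers)):
            members = new_labels == j
            if members.any():
                centers[j] = X_filled[members].mean(axis=0)

        # Membership-aware imputation: use the assigned cluster's center.
        X_filled = np.where(missing, centers[new_labels], X)

        # Stop once the assignments no longer change.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    return labels, centers, X_filled


# Small usage example: two groups, each with one partially observed row.
X = np.array([[0.0, 0.0], [0.0, np.nan], [1.0, 0.0],
              [10.0, 10.0], [np.nan, 10.0], [10.0, 11.0]])
labels, centers, X_filled = kmeans_missing(X, init_centers=[[0.0, 0.0], [10.0, 10.0]])
```

Like ordinary k-means, this scheme can settle in a poor local optimum depending on the starting centers, because an imputed coordinate follows whichever cluster the row is currently assigned to. The initial centers are therefore passed in explicitly rather than drawn at random.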

Authors who are presenting talks have a * after their name.
