2014 Joint Statistical Meetings - Statistics: Global Impact - Past, Present and Future

JSM 2014 Online Program


Activity Number:	34
Type:	Contributed
Date/Time:	Sunday, August 3, 2014 : 2:00 PM to 3:50 PM
Sponsor:	Health Policy Statistics Section
Abstract #312555
Title:	Clustering Incomplete Data Using Normal Mixture Models
Author(s):	Chantal Larose*+ and Dipak Dey and Ofer Harel
Companies:	University of Connecticut and University of Connecticut and University of Connecticut
Keywords:	gene expression ; missing data ; model-based clustering ; multiple imputation
Abstract:	Model-based clustering with Normal mixture models provides a framework for describing how data group together using Normal distributions. However, existing methods for such analyses require complete data. One way to handle incomplete data is multiple imputation, a simulation-based approach that avoids many of the disadvantages of other methods for handling incomplete data. However, multiple imputation and cluster analysis are difficult to combine in a straightforward manner. In this paper, we develop a new methodology for clustering incomplete data by adding clustering methods to particular steps of multiple imputation. We illustrate how the new method outperforms existing methodology in a simulation study using Fisher's Iris dataset, then demonstrate its utility on yeast gene expression data.
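
The abstract does not say at which steps of multiple imputation the clustering is added, so the following is only a generic sketch of one way the two ideas can be combined, not the authors' method. It assumes scikit-learn's IterativeImputer (with posterior sampling) as the imputation engine, GaussianMixture as the Normal mixture model, and a majority vote across imputed data sets as the pooling rule. The function names align_labels and cluster_incomplete are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.mixture import GaussianMixture


def align_labels(reference, labels, n_clusters):
    """Relabel `labels` so its clusters best match `reference`.

    Mixture-model cluster labels are arbitrary, so each imputed data set
    may number the same groups differently. Hungarian matching on the
    overlap table resolves this. (Hypothetical helper for illustration.)
    """
    overlap = np.zeros((n_clusters, n_clusters), dtype=int)
    for ref_label, label in zip(reference, labels):
        overlap[label, ref_label] += 1
    rows, cols = linear_sum_assignment(-overlap)  # negate to maximize agreement
    mapping = dict(zip(rows, cols))
    return np.array([mapping[label] for label in labels])


def cluster_incomplete(X, n_clusters=3, n_imputations=5, seed=0):
    """Impute X several times, cluster each completed copy, then majority-vote.

    Assumption: pooling by majority vote is a stand-in chosen for this
    sketch; the abstract does not describe the pooling step.
    """
    all_labels = []
    for i in range(n_imputations):
        # sample_posterior=True draws imputations instead of using point
        # predictions, so the completed data sets differ from one another.
        imputer = IterativeImputer(sample_posterior=True, random_state=seed + i)
        X_complete = imputer.fit_transform(X)

        gmm = GaussianMixture(n_components=n_clusters, random_state=seed)
        labels = gmm.fit_predict(X_complete)

        # Align every later labelling to the first one before voting.
        if all_labels:
            labels = align_labels(all_labels[0], labels, n_clusters)
        all_labels.append(labels)

    votes = np.stack(all_labels)  # shape: (n_imputations, n_samples)
    return np.array(
        [np.bincount(column, minlength=n_clusters).argmax() for column in votes.T]
    )
```

To try it, pass a NumPy array with missing entries coded as np.nan, for example cluster_incomplete(X, n_clusters=3, n_imputations=10). The result is one cluster label per row. Voting over aligned labels is simple but discards the per-imputation membership probabilities; averaging those probabilities would be another reasonable choice.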

Authors who are presenting talks have a * after their name.



The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.