Abstract Details
Activity Number: 34
Type: Contributed
Date/Time: Sunday, August 3, 2014 : 2:00 PM to 3:50 PM
Sponsor: Health Policy Statistics Section
Abstract #312555
Title: Clustering Incomplete Data Using Normal Mixture Models
Author(s): Chantal Larose*+ and Dipak Dey and Ofer Harel
Companies: University of Connecticut and University of Connecticut and University of Connecticut
Keywords: gene expression; missing data; model-based clustering; multiple imputation
Abstract:
Model-based clustering with Normal mixture models provides a framework for describing how data group together, with each group represented by a Normal distribution. However, existing methods for such analyses require complete data. One way to handle incomplete data is multiple imputation, a simulation-based approach that avoids many of the disadvantages of other methods for handling incomplete data. However, it is difficult to combine multiple imputation and cluster analysis in a straightforward manner.
In this paper, we develop a new methodology for clustering incomplete data. We incorporate clustering methods into particular steps of multiple imputation, which yields a way to cluster incomplete data. Using a simulation study based on Fisher's Iris dataset, we illustrate how the new method outperforms existing methodology; we then demonstrate the utility of the method on yeast gene expression data.
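For readers unfamiliar with the setup, the standard form of a Normal mixture model can be written as below. This is a general textbook sketch, not the specific formulation of the paper: the symbols K (number of clusters), \pi_k (mixing proportions), \mu_k and \Sigma_k (cluster means and covariances), and \tau_{ik} (membership probabilities) are notation assumed here and do not appear in the abstract.

```latex
% Density of a K-component Normal mixture (assumed notation)
f(x) = \sum_{k=1}^{K} \pi_k \, \phi(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1

% Posterior probability that observation x_i belongs to cluster k
\tau_{ik} = \frac{\pi_k \, \phi(x_i \mid \mu_k, \Sigma_k)}
                 {\sum_{j=1}^{K} \pi_j \, \phi(x_i \mid \mu_j, \Sigma_j)}
```

Here \phi denotes the multivariate Normal density. Evaluating \tau_{ik} requires every coordinate of x_i, which is one way to see why standard fitting procedures presuppose complete data.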
Authors who are presenting talks have a * after their name.
2014 JSM Online Program Home