Activity Number: 426 - SPEED: Biopharmaceutical and General Health Studies: Statistical Methods and Applications, Part 2
Type: Contributed
Date/Time: Tuesday, July 30, 2019 : 3:05 PM to 3:50 PM
Sponsor: Health Policy Statistics Section
Abstract #307842
Title: Clustering of Multivariate Data with Varying Dimensions
Author(s): Xiaoqi Lu* and Bin Cheng and Ying Kuen Ken Cheung
Companies: Columbia University and Columbia University and Columbia University
Keywords: clustering; healthcare; zero-inflation; hierarchical
Abstract:
Clustering is a common unsupervised learning method that helps reveal hidden structure in data by grouping similar objects. However, such methods have not been widely applied to healthcare data, whose distributions are usually zero-inflated. We propose a parametric modeling approach with a two-layer hierarchical structure: the first layer models the zero-inflation pattern, while the second layer models the conditional distribution of the positive entries. Parameters are estimated by regularized maximum likelihood estimation (MLE) using the expectation-maximization (EM) algorithm.
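The two-layer idea can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not name the distributions or the regularizer, so the code below assumes, purely for illustration, a Bernoulli zero/positive indicator per dimension (layer 1), a normal distribution on the log of the positive entries (layer 2), independence across dimensions within a cluster, and pseudo-counts plus a variance floor as a stand-in for the regularization. The function name `fit_zero_inflated_mixture` and all of its parameters are hypothetical.

```python
import numpy as np

def fit_zero_inflated_mixture(X, n_clusters, n_iter=100, pseudo=1.0,
                              var_floor=1e-3, seed=0):
    """EM for an illustrative two-layer mixture (assumed, not the authors' model)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pos = X > 0                                   # layer 1 data: positive indicator
    posf = pos.astype(float)
    logx = np.log(np.where(pos, X, 1.0))          # layer 2 data: log of positives, 0 elsewhere

    # Start from random soft assignments, shape (n, K).
    resp = rng.dirichlet(np.ones(n_clusters), size=n)
    for _ in range(n_iter):
        # M-step: weighted estimates, smoothed by pseudo-counts (assumed regularizer).
        nk = resp.sum(axis=0)
        weights = (nk + pseudo) / (n + n_clusters * pseudo)
        q = (resp.T @ posf + pseudo) / (nk[:, None] + 2.0 * pseudo)  # P(entry > 0)
        npos = resp.T @ posf + 1e-12              # effective count of positive entries
        mu = (resp.T @ logx) / npos
        var = np.maximum((resp.T @ logx**2) / npos - mu**2, var_floor)

        # E-step: posterior cluster probabilities, computed in log space.
        # The lognormal Jacobian term is the same for every cluster, so it is omitted.
        log_p = np.tile(np.log(weights), (n, 1))
        for k in range(n_clusters):
            ll_zero = np.where(pos, np.log(q[k]), np.log1p(-q[k]))
            ll_pos = -0.5 * (np.log(2 * np.pi * var[k]) + (logx - mu[k])**2 / var[k])
            log_p[:, k] += (ll_zero + np.where(pos, ll_pos, 0.0)).sum(axis=1)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

    return resp.argmax(axis=1), weights, q, mu, var
```

Calling `fit_zero_inflated_mixture(X, n_clusters=2)` on a nonnegative array `X` of shape `(n, d)` returns hard cluster labels together with the fitted mixing weights, per-dimension positive probabilities, and log-scale means and variances.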
Authors who are presenting talks have a * after their name.