Abstract Details

Activity Number: 339 - SPEED: Biopharmaceutical and General Health Studies: Statistical Methods and Applications, Part 1
Type: Contributed
Date/Time: Tuesday, July 30, 2019 : 10:30 AM to 12:20 PM
Sponsor: Health Policy Statistics Section
Abstract #306499 Presentation
Title: Clustering of Multivariate Data with Varying Dimensions
Author(s): Xiaoqi Lu* and Bin Cheng and Ying Kuen Ken Cheung
Companies: Columbia University and Columbia University and Columbia University
Keywords: clustering; healthcare; zero-inflation; hierarchical

Clustering is a common unsupervised learning method that helps reveal hidden structure in data by grouping similar objects. However, such methods have not been widely applied to healthcare data, whose distributions are usually zero-inflated. We propose a parametric modeling approach with a two-layer hierarchical structure: the first layer models the zero-inflation pattern, while the second layer models the conditional distribution of the positive entries. Parameters are estimated by regularized maximum likelihood estimation (MLE) using the expectation-maximization (EM) algorithm.
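
As a rough illustration of the two-layer idea, the sketch below fits a finite mixture by EM in which each cluster has a Bernoulli layer for whether an entry is positive and a second layer for the positive values. The abstract does not state the positive-part distribution or the form of the regularizer, so the log-normal choice, the pseudo-count `alpha`, the variance floor `var_floor`, and the function name `fit_zero_inflated_mixture` are all assumptions made for this example, not the authors' method.

```python
import numpy as np


def fit_zero_inflated_mixture(X, n_clusters, n_iter=50, alpha=1.0,
                              var_floor=1e-3, seed=0):
    """EM for a mixture with a zero/positive layer and a positive-value layer.

    Assumptions (not from the abstract): positives are log-normal within a
    cluster, features are independent given the cluster, and regularization
    is a Beta-style pseudo-count (alpha) plus a variance floor (var_floor).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    pos = (X > 0).astype(float)                # layer 1: zero / positive pattern
    logx = np.log(np.where(X > 0, X, 1.0))     # layer 2: log of positives, 0 elsewhere
    resp = rng.dirichlet(np.ones(n_clusters), size=n)  # random soft assignments

    for _ in range(n_iter):
        # M-step: update parameters from the current responsibilities.
        nk = resp.sum(axis=0) + 1e-12          # effective cluster sizes, shape (K,)
        weights = nk / nk.sum()
        cnt = resp.T @ pos                     # effective number of positives, (K, d)
        p = (cnt + alpha) / (nk[:, None] + 2.0 * alpha)   # smoothed P(entry > 0)
        safe = np.maximum(cnt, 1e-12)
        mu = (resp.T @ logx) / safe
        var = np.maximum((resp.T @ logx**2) / safe - mu**2, 0.0) + var_floor

        # E-step: log-likelihood of each row under each cluster, shape (n, K).
        ll = pos @ np.log(p).T + (1.0 - pos) @ np.log1p(-p).T
        # Gaussian log-density of log(x) on positive entries only; the
        # log-normal Jacobian term is constant across clusters and is dropped.
        ll -= 0.5 * (pos @ (np.log(2.0 * np.pi * var) + mu**2 / var).T
                     + (logx**2) @ (1.0 / var).T
                     - 2.0 * logx @ (mu / var).T)
        log_r = np.log(weights) + ll
        log_r -= log_r.max(axis=1, keepdims=True)   # stabilize before exponentiating
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

    return {"weights": weights, "p_positive": p, "mu": mu, "var": var,
            "resp": resp, "labels": resp.argmax(axis=1)}


if __name__ == "__main__":
    # Synthetic zero-inflated data: about half of the entries are set to zero.
    rng = np.random.default_rng(1)
    X = rng.lognormal(size=(200, 5)) * (rng.random((200, 5)) < 0.5)
    fit = fit_zero_inflated_mixture(X, n_clusters=2)
    print(fit["weights"], np.bincount(fit["labels"], minlength=2))
```

The random soft initialization keeps the sketch short; in practice several restarts or a more careful starting point would likely be needed, since EM only finds a local optimum.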

Authors who are presenting talks have a * after their name.