
Abstract Details

Activity Number: 501
Type: Contributed
Date/Time: Wednesday, August 3, 2016 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #319540
Title: A Dirichlet Process Mixture Model for Clustering Longitudinal Gene Expression Data
Author(s): Jiehuan Sun* and Jose D. Herazo-Maya and Naftali Kaminski and Hongyu Zhao and Joshua Warren
Companies: Yale University and Yale School of Medicine and Yale School of Medicine and Yale University and Yale University
Keywords: Bayesian factor analysis ; Bayesian nonparametrics ; Clustering ; Dirichlet process ; Longitudinal gene expression study
Abstract:

Gene expression profiles have been widely used to define subgroups of diseased patients, which often leads to novel biological insights into diseases. Longitudinal gene expression profiles collected from patients over time might provide more information on disease progression than is captured by baseline profiles alone. Therefore, subgroup identification could be more accurate with the aid of longitudinal gene expression data. However, existing statistical methods are unable to fully utilize these data for patient clustering. In this article, we introduce a novel Bayesian subgroup identification method based on longitudinal gene expression profiles. This method, called BClustLonG, adopts a linear mixed-effects model framework to model gene expression trajectories over time, and a Dirichlet process prior distribution is assumed for the random effects to induce clustering. In addition, a factor analysis model is used for the regression coefficients to account for the correlations among genes and alleviate the curse of dimensionality. Through extensive simulation studies and real data analysis, we show that BClustLonG has improved performance over an empirical method.
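
To illustrate how a Dirichlet process prior on random effects can induce patient clusters, the following minimal sketch simulates longitudinal data in which patients who share the same random-effect atom fall into the same subgroup. It is not the BClustLonG implementation: the function names, the standard normal base measure, the Chinese restaurant process representation, and all parameter values are assumptions chosen for illustration, and the factor analysis component is omitted.

```python
import random


def crp_assignments(n_patients, alpha, rng):
    """Draw cluster labels from a Chinese restaurant process.

    The CRP describes the partition implied by a Dirichlet process prior
    with concentration parameter alpha (illustrative choice).
    """
    labels = []
    counts = []  # counts[k] = number of patients already in cluster k
    for _ in range(n_patients):
        # Join existing cluster k with weight counts[k],
        # or open a new cluster with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    return labels


def simulate(n_patients=20, n_genes=3, times=(0.0, 1.0, 2.0),
             alpha=1.0, noise_sd=0.1, seed=0):
    """Simulate gene trajectories with cluster-specific random effects.

    Each cluster gets one (intercept, slope) pair per gene, drawn from an
    assumed N(0, 1) base measure. Patients in the same cluster share these
    effects, which is what produces the clustering.
    """
    rng = random.Random(seed)
    labels = crp_assignments(n_patients, alpha, rng)
    n_clusters = max(labels) + 1
    atoms = [[(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_genes)]
             for _ in range(n_clusters)]
    data = []
    for k in labels:
        patient = []
        for g in range(n_genes):
            b0, b1 = atoms[k][g]
            # Linear trajectory over time plus measurement noise.
            patient.append([b0 + b1 * t + rng.gauss(0, noise_sd)
                            for t in times])
        data.append(patient)
    return labels, data


if __name__ == "__main__":
    labels, data = simulate()
    print("cluster labels:", labels)
    print("patient 0, gene 0 trajectory:", data[0][0])
```

In this sketch a larger alpha tends to produce more clusters. A full Bayesian analysis would infer the labels and effects from observed data rather than simulate them.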


Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

 
 
Copyright © American Statistical Association