Abstract:
Large-scale genomic profiling studies such as The Cancer Genome Atlas (TCGA) project have generated multiple types of genomic data (e.g., gene expression, DNA copy number, and mutation) for individual tumor samples. In an effort to identify clinically relevant tumor subtypes, we recently developed a latent variable model (Mo et al., 2013, PNAS 110(11):4245-50) that jointly models four data types (binary, categorical, count, and continuous) to cluster tumor samples based on their joint profiling patterns. The method combines Bayesian and frequentist approaches to achieve genomic feature selection and joint dimension reduction, reducing complex genomic data to a few eigen features that can be used for integrated visualization and cluster discovery. The major limitations of the method are the lack of statistical inference for the selected genomic features and the need for a grid search over model parameters. To overcome these limitations, we have developed a fully Bayesian model that uses a Bayesian variable selection approach to select the genomic features that contribute to tumor sample clustering. We will use cancer genomic data to illustrate the new approach.
Copyright © American Statistical Association.