Abstract Details

Activity Number: 448
Type: Contributed
Date/Time: Tuesday, August 2, 2016 : 2:00 PM to 2:45 PM
Sponsor: Biometrics Section
Abstract #321564
Title: Supervised Integrated Principal Component Analysis
Author(s): Gen Li* and Sungkyu Jung
Companies: Columbia University and University of Pittsburgh
Keywords: Dimension Reduction; Data Integration; Multi-Source Data; Supervised Integrated PCA

In modern biomedical research, it is increasingly common to collect data sets from multiple sources for a common set of subjects. However, data from different sources may be heterogeneous, and each data set may be high dimensional, making it impractical to carry out analyses (such as predictive modeling or clustering) directly on the original data. Dimension reduction can reduce the size and complexity of the data and integrate multiple data sets into a common space. In this paper, we introduce Supervised Integrated Principal Component Analysis (SIPCA), a new computational tool for the integration and reduction of multi-source data. The method explicitly captures joint and individual structures across multiple primary data sources. Moreover, when auxiliary data drive the underlying structures, SIPCA accounts for the auxiliary information through a latent variable model. SIPCA substantially improves the interpretability of the reduced data over existing dimension reduction methods. We demonstrate the advantages of SIPCA using a multi-tissue genetic study and a pediatric growth study.
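
For orientation only, the Python sketch below illustrates the simplest kind of baseline that integrates multi-source data into a common low-dimensional space: ordinary PCA applied to column-centered sources stacked side by side. It is not SIPCA and not the authors' implementation. It does not separate joint from individual structure and does not model the auxiliary data. The function names `simulate` and `concatenated_pca`, the simulated data, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def simulate(n=200, dims=(30, 40), seed=0):
    """Simulate two data sources that share one joint score.

    Hypothetical toy setup: the joint score is driven mostly by a single
    auxiliary covariate y, and each source adds independent Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    y = rng.normal(size=(n, 1))                        # auxiliary covariate
    joint = 2.0 * y + 0.3 * rng.normal(size=(n, 1))    # shared latent score
    sources = []
    for p in dims:
        loading = rng.normal(size=(p, 1))              # source-specific loading
        sources.append(joint @ loading.T + rng.normal(size=(n, p)))
    return y, sources


def concatenated_pca(sources, rank):
    """Plain PCA on horizontally stacked, column-centered sources.

    Returns the top-`rank` scores (one row per subject) and the loadings
    split back into one block per source.
    """
    centered = [x - x.mean(axis=0) for x in sources]
    stacked = np.hstack(centered)
    u, s, vt = np.linalg.svd(stacked, full_matrices=False)
    scores = u[:, :rank] * s[:rank]
    # Row boundaries between sources inside the stacked loading matrix.
    boundaries = np.cumsum([x.shape[1] for x in sources])[:-1]
    loadings = np.split(vt[:rank].T, boundaries, axis=0)
    return scores, loadings


if __name__ == "__main__":
    y, sources = simulate()
    scores, loadings = concatenated_pca(sources, rank=1)
    corr = np.corrcoef(scores[:, 0], y[:, 0])[0, 1]
    print("score shape:", scores.shape)
    print("loading shapes:", [block.shape for block in loadings])
    print("|corr(score, y)|:", round(abs(corr), 3))
```

In this toy setting the leading score tracks the auxiliary covariate only because the simulated joint structure is strong. The baseline has no mechanism for using the covariate directly or for isolating source-specific variation, which are the gaps the abstract says SIPCA addresses.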

Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

Copyright © American Statistical Association