Abstract Details

Activity Number: 499
Type: Contributed
Date/Time: Wednesday, August 3, 2016, 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #318982
Title: Incorporating Biological Information in Sparse Principal Component Analysis with Application to Genomic Data
Author(s): Ziyi Li* and Qi Long and Sandra Safo
Companies: Emory University and Emory University and Emory University
Keywords: Biological information ; high dimension, low sample size ; principal component analysis ; sparsity

Advances in technology have led to the collection of high-dimensional data such as genomic data. Before existing statistical methods are applied to high-dimensional data, principal component analysis (PCA) is often used to reduce dimensionality. Sparse PC loadings are usually desired in this setting for simplicity and better interpretability. Although PCA has been extended to produce sparse PC loadings, few methods take potential biological information into account. In this article, we propose two novel structured sparse PCA methods that not only yield sparse solutions but also incorporate available biological information. Our simulation study demonstrates that incorporating known biological information improves the performance of sparse PCA methods and that the proposed methods are robust to potential misspecification of the biological information. We further illustrate the performance of our methods on a glioblastoma genomic data set.

Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

Copyright © American Statistical Association