Activity Number: 102
Type: Contributed
Date/Time: Monday, July 30, 2007 : 8:30 AM to 10:20 AM
Sponsor: Biometrics Section
Abstract - #310032
Title: Sparse Partial Least Squares Regression with an Application to the Genome Scale Transcription Factor Activity Analysis
Author(s): Hyonho Chun*+ and Sunduz Keles
Companies: University of Wisconsin-Madison and University of Wisconsin-Madison
Address: 1300 University Avenue, Madison, WI, 53706
Keywords: SPLS ; variable selection ; gene expression ; genome-wide binding data
Abstract:
Partial least squares (PLS) has been used in the analysis of modern biological data, which involve high dimensionality and multicollinearity. However, PLS is not particularly tailored for variable selection, and this can be problematic when the majority of the variables are noise. We show the inconsistency of PLS in the presence of a large number of noise variables. We propose a sparse partial least squares (SPLS) method that aims to simultaneously achieve good predictive performance and variable selection, thereby producing sparse linear combinations of the original predictors. We formulate SPLS by imposing an L1 penalty, and show that simple soft thresholding is the solution for a univariate response. We investigate the performance of SPLS through a simulation study and apply SPLS to the problem of inferring transcription factor activity by integrating gene expression microarray data and genome-wide binding data.
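
The abstract states that soft thresholding solves the L1-penalized problem for a univariate response but does not give the formula. As a rough illustration only, the sketch below applies the standard soft-thresholding operator, sign(z) * max(|z| - lam, 0), to the covariance vector X^T y and rescales the result to unit length. Applying it to X^T y, the threshold `lam`, and the function names `soft_threshold` and `sparse_direction` are all assumptions made for this example and are not taken from the abstract.

```python
from math import sqrt


def soft_threshold(z, lam):
    """Standard soft-thresholding operator: sign(z) * max(|z| - lam, 0)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def sparse_direction(X, y, lam):
    """Soft-threshold the covariance vector X^T y and rescale to unit length.

    X is a list of n rows with p predictors each, y is a list of n responses,
    and lam is a hypothetical threshold (an assumption of this sketch).
    The columns of X and y are assumed to be centered already.
    """
    p = len(X[0])
    # c[j] = sum_i X[i][j] * y[i], the j-th entry of X^T y
    c = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    # Entries with |c[j]| <= lam are set exactly to zero, which drops predictor j
    w = [soft_threshold(cj, lam) for cj in c]
    norm = sqrt(sum(wj * wj for wj in w))
    # If every entry was thresholded away, return the all-zero vector as is
    return [wj / norm for wj in w] if norm > 0 else w


# Toy data: the first predictor tracks y, the second is only weakly related
X = [[1.0, 0.2], [-1.0, 0.0], [1.0, 0.0], [-1.0, -0.2]]
y = [2.0, -2.0, 2.0, -2.0]
w = sparse_direction(X, y, lam=1.0)
```

Here X^T y is (8, 0.8), so with `lam=1.0` the second entry is set to zero and the resulting direction uses only the first predictor. With `lam=0.0` no thresholding occurs and both predictors keep nonzero weight, which corresponds to an ordinary (non-sparse) first direction.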