Abstract Details

Activity Number: 109 - Learning from External Covariates in High-Dimensional Genomic Data Analysis
Type: Topic Contributed
Date/Time: Monday, July 31, 2017 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #322794
Title: Data-Driven Penalization for High-Dimensional Regression and Classification Using External Covariates
Author(s): Britta Velten* and Wolfgang Huber
Companies: EMBL Heidelberg and EMBL Heidelberg
Keywords: regression ; high-dimensional ; Bayesian ; multi-omic ; co-data
Abstract:

Penalization schemes such as the LASSO or ridge regression are routinely used to regress a response of interest on a high-dimensional set of features. Commonly used approaches assume that features are exchangeable: the same penalty factor is applied to every model coefficient. In many applications, however, additional information about the features is available. Such information can include structural knowledge (e.g., feature sets comprising multiple data types and data qualities, such as transcriptome, genome, and epigenome in biology) and/or different prior probabilities for different feature classes (e.g., based on gene or pathway annotation or prior studies). We present a hierarchical Bayesian model that enables differential penalization of groups of features based on external covariates and adapts the penalty to the information content of each group in a data-driven way. In an application to drug response prediction for cancer patients from multiple 'omic data types, the method identifies meaningful differences between 'omic data types. Using available covariates extends the range of applications of penalized regression, improves model interpretability, and can improve prediction performance.
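
To make the idea of differential penalization concrete, the following is a minimal sketch of ridge regression with one penalty per feature group. It is not the hierarchical Bayesian model described in the abstract: that model learns the group penalties from the data, whereas here the penalties are fixed by hand. The function name `group_ridge`, the simulated data, and the penalty values are all hypothetical and chosen only for illustration.

```python
import numpy as np

def group_ridge(X, y, groups, penalties):
    """Ridge regression with one penalty per feature group.

    groups[j] is the group index of feature j; penalties[g] is the
    hand-picked (hypothetical) penalty applied to every feature in group g.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Expand the per-group penalties into one penalty per feature.
    lam = np.asarray(penalties, dtype=float)[np.asarray(groups)]
    # Closed-form solution: beta = (X'X + diag(lam))^{-1} X'y
    return np.linalg.solve(X.T @ X + np.diag(lam), X.T @ y)

# Simulated data (an assumption for illustration): group 0 carries signal,
# group 1 is pure noise.
rng = np.random.default_rng(0)
n, p = 50, 6
X = rng.normal(size=(n, p))
groups = np.array([0, 0, 0, 1, 1, 1])
true_beta = np.array([1.0, -1.0, 0.5, 0.0, 0.0, 0.0])
y = X @ true_beta + 0.1 * rng.normal(size=n)

# Exchangeable features: the same penalty for every coefficient.
beta_uniform = group_ridge(X, y, groups, penalties=[1.0, 1.0])
# Differential penalization: the noise group is shrunk much more strongly.
beta_grouped = group_ridge(X, y, groups, penalties=[1.0, 100.0])
```

With equal penalties the function reduces to ordinary ridge regression; raising the penalty for one group shrinks that group's coefficients toward zero while leaving the other group's penalty unchanged. A data-driven method would replace the hand-picked values with penalties estimated per group.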


Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program

Copyright © American Statistical Association