The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
Abstract Details
Activity Number: 613
Type: Contributed
Date/Time: Thursday, August 2, 2012, 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Learning and Data Mining
Abstract - #304971
Title: Smoothed Stability Selection for Analysis of Sequencing Data
Author(s): Eugene Urrutia*+ and Li Yun and Michael C Wu
Companies: The University of North Carolina at Chapel Hill and The University of North Carolina at Chapel Hill and The University of North Carolina at Chapel Hill
Address: 1616 S Mineral Springs Rd, Durham, NC, 27703, United States
Keywords: variable selection; smooth stability selection; next generation sequencing; rare variants; high dimensional data; resampling methods
Abstract:
High dimensional data are increasingly common. Difficulties in model interpretation and limited power to detect effects have led to the use of variable selection methods, including penalized regression methods such as the LASSO. Recent advances in the variable selection literature suggest that resampling strategies, including stability selection, complementary stability selection, and bolasso, can offer improvements over the LASSO and other non-resampling based approaches. We show that common resampling based methods can be recast as the LASSO with reweighted observations, where the weights are discrete and identically distributed draws from a specified distribution. Sequencing data present an additional challenge in that many predictors are rare (minor alleles observed in only a few individuals) and have a high probability of exclusion under these resampling schemes. We have therefore developed the smooth stability selection procedure, in which we replace the discrete weights with continuous weights and thus avoid excluding rare predictors. Simulation results and real data analyses suggest that our proposed method increases power to select rare variables while retaining type I error control.
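
The reweighting view can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name the continuous weight distribution, so the Exponential(1) weights below are an assumption, as are the sample size, the two-carrier variant, and all function names. A bootstrap resample is written as per-observation counts and a half-subsample as 0/1 weights. The sketch then estimates how often every carrier of a rare variant receives weight zero, which is when the variant drops out of the weighted fit.

```python
import random


def bootstrap_weights(n, rng):
    """Bootstrap resample written as per-observation counts (sums to n)."""
    weights = [0] * n
    for _ in range(n):
        weights[rng.randrange(n)] += 1
    return weights


def subsample_weights(n, rng):
    """Half-subsample without replacement written as 0/1 weights."""
    chosen = set(rng.sample(range(n), n // 2))
    return [1 if i in chosen else 0 for i in range(n)]


def continuous_weights(n, rng):
    """Continuous positive weights; Exponential(1) is an illustrative assumption."""
    return [rng.expovariate(1.0) for _ in range(n)]


def exclusion_rate(weight_fn, n, carriers, draws, seed=0):
    """Fraction of weight draws in which every carrier gets weight zero."""
    rng = random.Random(seed)
    excluded = 0
    for _ in range(draws):
        weights = weight_fn(n, rng)
        if all(weights[i] == 0 for i in carriers):
            excluded += 1
    return excluded / draws


n = 200            # hypothetical number of individuals
carriers = [0, 1]  # hypothetical rare variant observed in two individuals
schemes = {
    "bootstrap": bootstrap_weights,
    "subsample": subsample_weights,
    "continuous": continuous_weights,
}
rates = {name: exclusion_rate(fn, n, carriers, draws=2000)
         for name, fn in schemes.items()}
for name, rate in rates.items():
    print(f"{name}: exclusion rate {rate:.3f}")
```

Under these assumptions the bootstrap should exclude the variant in roughly (1 - 2/200)^200, or about 13%, of draws, and the half-subsample in about (100/200)(99/199), or 25%. The continuous weights are positive with probability one, so the variant is essentially never excluded. Passing any of these weight vectors as observation weights to a weighted LASSO fit would complete one resampling iteration.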
The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.