JSM 2012 Online Program

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

Abstract Details

Activity Number: 613
Type: Contributed
Date/Time: Thursday, August 2, 2012 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Learning and Data Mining
Abstract - #304971
Title: Smoothed Stability Selection for Analysis of Sequencing Data
Author(s): Eugene Urrutia*+ and Li Yun and Michael C Wu
Companies: The University of North Carolina at Chapel Hill and The University of North Carolina at Chapel Hill and The University of North Carolina at Chapel Hill
Address: 1616 S Mineral Springs Rd, Durham, NC, 27703, United States
Keywords: variable selection ; smooth stability selection ; next generation sequencing ; rare variants ; high dimensional data ; resampling methods

High-dimensional data are increasingly common. Difficulties in model interpretation and limited power to detect effects have led to the use of variable selection methods, including penalized regression methods such as the LASSO. Recent advances in the variable selection literature suggest that resampling strategies, including stability selection, complementary stability selection, and the bolasso, can improve on the LASSO and other non-resampling approaches. We show that common resampling-based methods can be recast as the LASSO with reweighted observations, where the weights are discrete and identically distributed according to a specified distribution. Sequencing data present an additional challenge: many predictors are rare (minor alleles observed in only a few individuals) and have a high probability of being excluded under these resampling schemes. We have therefore developed the smooth stability selection procedure, which replaces the discrete weights with continuous weights and so avoids excluding rare predictors. Simulation results and real data analyses suggest that the proposed method increases power to select rare variables while retaining type I error control.
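
The reweighting view described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the coordinate-descent solver, the absence of an intercept, the penalty value, and in particular the choice of exponential weights for the continuous case are all assumptions made for illustration, since the abstract does not specify them. The sketch fits a LASSO with observation weights, draws the weights either discretely (0/1 for half-subsampling, multinomial counts for the bootstrap) or continuously, and records how often each predictor is selected across resamples.

```python
import numpy as np

def weighted_lasso(X, y, w, lam, n_iter=100):
    """Coordinate descent for 0.5 * sum_i w_i (y_i - x_i'b)^2 + lam * ||b||_1.

    Illustrative solver (no intercept); not taken from the abstract.
    """
    b = np.zeros(X.shape[1])
    r = y.astype(float)                      # residual y - X b (b starts at 0)
    col_norm = (w[:, None] * X**2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if col_norm[j] == 0.0:
                continue                     # predictor absent from this resample
            r += X[:, j] * b[j]              # remove predictor j from the fit
            rho = np.dot(w * X[:, j], r)
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_norm[j]
            r -= X[:, j] * b[j]              # add the updated predictor back
    return b

def subsample_weights(n, rng):
    """Discrete 0/1 weights: a random half of the observations is kept."""
    w = np.zeros(n)
    w[rng.choice(n, n // 2, replace=False)] = 1.0
    return w

def bootstrap_weights(n, rng):
    """Discrete multinomial counts: equivalent to drawing n rows with replacement."""
    return rng.multinomial(n, np.full(n, 1.0 / n)).astype(float)

def smooth_weights(n, rng):
    """Continuous weights. Exponential(1) is an assumed choice for illustration."""
    return rng.exponential(1.0, size=n)

def selection_frequency(X, y, draw, lam, B=50, seed=0):
    """Fraction of B reweighted LASSO fits in which each predictor is nonzero."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(X.shape[1])
    for _ in range(B):
        w = draw(len(y), rng)
        counts += weighted_lasso(X, y, w, lam) != 0
    return counts / B

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 100, 10
    X = rng.normal(size=(n, p))
    X[:, 0] = 0.0
    X[:3, 0] = 1.0                           # rare binary predictor: 3 carriers
    y = 3.0 * X[:, 0] + rng.normal(size=n)
    for draw in (subsample_weights, bootstrap_weights, smooth_weights):
        print(draw.__name__, selection_frequency(X, y, draw, lam=1.0).round(2))
```

With discrete weights, every carrier of a rare predictor can receive weight zero in a given resample, in which case the predictor cannot be selected in that fit. Strictly positive continuous weights keep every carrier in every fit, which is the motivation the abstract gives for the smooth procedure.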

The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.
