Name: 2018 Joint Statistical Meetings
Start: 2018-07-28T07:00:00+00:00
End: 2018-08-02
Location: Vancouver Convention Centre

Abstract Details

Activity Number:	310 - Topics of Variable Selection
Type:	Contributed
Date/Time:	Tuesday, July 31, 2018 : 8:30 AM to 10:20 AM
Sponsor:	Section on Statistical Learning and Data Science
Abstract #330260	Presentation
Title:	Budget-Constrained Feature Selection for Binary Classification: a Neyman-Pearson Approach
Author(s):	Yiling Chen* and Xin Tong and Jingyi Li
Companies:	University of California, Los Angeles and University of Southern California and University of California, Los Angeles
Keywords:	feature selection; disease diagnosis; type 1 error; false positive control; Neyman-Pearson; machine learning
Abstract:	In biomedical applications such as cancer diagnosis, binary classification often requires asymmetric control of misclassification errors, because misclassifying a diseased patient as healthy and misclassifying a healthy patient as diseased have very different consequences. Previously, we proposed the Neyman-Pearson (NP) classification paradigm to address such asymmetric classification problems. An important unsolved question is which features are more important under the NP paradigm. Here we propose NP-Rank, a method that ranks features by their type II errors (the less severe type of misclassification error) while keeping their type I errors (the more severe type) below a user-specified threshold (such as 0.05) with high probability. NP-Rank has desirable theoretical guarantees when used with density plug-in classifiers. Extensive numerical studies show that NP-Rank, used with popular classification methods such as logistic regression, outperforms traditional ranking methods under the classical paradigm. A real-data application to DNA methylation profiles from breast cancer patients further demonstrates the advantages of NP-Rank.
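
The ranking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' NP-Rank implementation, which the abstract does not specify. It assumes that each feature is used on its own as a score (larger values suggest class 1, the diseased class), that the threshold is an order statistic of the class-0 scores chosen so that the type I error exceeds alpha with probability at most delta, and that the type II error is estimated empirically on class-1 observations. The function names, the tolerance delta, and the simulated data are all hypothetical.

```python
import math
import random


def np_order_statistic_rank(n, alpha, delta):
    """Smallest k such that thresholding at the k-th smallest of n class-0
    scores has type I error above alpha with probability at most delta
    (assumes i.i.d. continuous scores). Returns None if no k qualifies."""
    for k in range(1, n + 1):
        # P(type I error > alpha) when the threshold is the k-th order statistic
        violation = sum(
            math.comb(n, j) * (1 - alpha) ** j * alpha ** (n - j)
            for j in range(k, n + 1)
        )
        if violation <= delta:
            return k
    return None


def rank_features(class0_rows, class1_rows, alpha=0.05, delta=0.05):
    """Rank features by empirical type II error, with the type I error of each
    single-feature classifier controlled at alpha with probability >= 1 - delta."""
    k = np_order_statistic_rank(len(class0_rows), alpha, delta)
    if k is None:
        raise ValueError("too few class-0 observations for this alpha and delta")
    results = []
    for f in range(len(class0_rows[0])):
        # Threshold: k-th smallest class-0 value of this feature
        threshold = sorted(row[f] for row in class0_rows)[k - 1]
        # Type II error: class-1 observations not exceeding the threshold
        misses = sum(1 for row in class1_rows if row[f] <= threshold)
        results.append((f, misses / len(class1_rows)))
    # Smaller type II error means a more useful feature under the NP paradigm
    return sorted(results, key=lambda pair: pair[1])


# Simulated data (hypothetical): feature 0 is informative, feature 1 is noise
rng = random.Random(0)
class0 = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
class1 = [[rng.gauss(3, 1), rng.gauss(0, 1)] for _ in range(200)]
ranking = rank_features(class0, class1)
print(ranking)
```

With alpha = 0.05 and delta = 0.05, this thresholding rule needs at least 59 class-0 observations, because (0.95)^n first drops below 0.05 at n = 59. A fuller treatment would train a classifier on one part of the data and calibrate the threshold on a held-out part, which this single-feature sketch sidesteps by using the raw feature value as the score.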

Authors who are presenting talks have a * after their name.

American Statistical Association