JSM 2015 Preliminary Program


Abstract Details

Activity Number: 655
Type: Contributed
Date/Time: Thursday, August 13, 2015 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Learning and Data Mining
Abstract #316580
Title: Binormal Precision-Recall Curves for Optimizing Classification of Imbalanced Data
Author(s): Zhongkai Liu* and Howard Bondell
Companies: North Carolina State University and North Carolina State University
Keywords: Binary classification ; Imbalanced data ; ROC curve ; Precision-Recall curve ; Binormal assumption
Abstract:

Binary classification on imbalanced data, i.e. data with a large skew in the class distribution, is a challenging problem. Classifiers based on the receiver operating characteristic (ROC) curve have been regarded as the gold standard in binary classification. However, when applied to imbalanced data, the ROC curve tends to give an overly optimistic view of performance. To address this shortcoming, we propose a Precision-Recall (PR) curve based approach under a binormal assumption, where the key idea is to estimate the classifier that maximizes the area under the binormal Precision-Recall curve. We present the asymptotic distribution of the estimate, and both simulation and real-data results indicate that the binormal Precision-Recall method outperforms approaches based on the area under the ROC curve in terms of false discovery rate and asymptotic variance.
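
As a rough illustration of the quantities named in the abstract, the sketch below traces a Precision-Recall curve under a binormal assumption, meaning that classifier scores are taken to be normally distributed within each class. It is not the authors' estimator: it only evaluates the curve and its area for given parameters. The function names, the parameter names (`mu_pos`, `sigma_pos`, `mu_neg`, `sigma_neg`, `prevalence`), the threshold grid, and the trapezoidal integration are all assumptions made for this example.

```python
import math


def normal_sf(x, mu, sigma):
    """Survival function P(X > x) for X ~ N(mu, sigma^2)."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


def binormal_pr_curve(mu_pos, sigma_pos, mu_neg, sigma_neg, prevalence,
                      n_grid=2001):
    """Return (recall, precision) points for a binormal score model.

    Assumption for this sketch: scores of positives are N(mu_pos, sigma_pos^2),
    scores of negatives are N(mu_neg, sigma_neg^2), and `prevalence` is the
    proportion of positives. A case is called positive when score > threshold.
    """
    lo = min(mu_pos - 8 * sigma_pos, mu_neg - 8 * sigma_neg)
    hi = max(mu_pos + 8 * sigma_pos, mu_neg + 8 * sigma_neg)
    points = []
    for i in range(n_grid):
        # Sweep the threshold downward so that recall increases along the curve.
        t = hi - (hi - lo) * i / (n_grid - 1)
        recall = normal_sf(t, mu_pos, sigma_pos)   # true positive rate
        fpr = normal_sf(t, mu_neg, sigma_neg)      # false positive rate
        denom = prevalence * recall + (1.0 - prevalence) * fpr
        # Convention for this sketch: precision is 1 when nothing is flagged.
        precision = prevalence * recall / denom if denom > 0 else 1.0
        points.append((recall, precision))
    return points


def area_under(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def binormal_roc_auc(mu_pos, sigma_pos, mu_neg, sigma_neg):
    """Closed-form area under the binormal ROC curve."""
    z = (mu_pos - mu_neg) / math.sqrt(sigma_pos ** 2 + sigma_neg ** 2)
    return 0.5 * math.erfc(-z / math.sqrt(2.0))  # standard normal CDF at z


# Same score distributions, two different class balances (hypothetical values).
roc_auc = binormal_roc_auc(2.0, 1.0, 0.0, 1.0)
pr_balanced = area_under(binormal_pr_curve(2.0, 1.0, 0.0, 1.0, prevalence=0.5))
pr_rare = area_under(binormal_pr_curve(2.0, 1.0, 0.0, 1.0, prevalence=0.01))
print(f"ROC AUC: {roc_auc:.3f}")
print(f"PR area, prevalence 0.50: {pr_balanced:.3f}")
print(f"PR area, prevalence 0.01: {pr_rare:.3f}")
```

In this setup the ROC area does not depend on `prevalence` at all, while the PR area shrinks as positives become rarer, which is one way to see why the ROC curve can look optimistic on imbalanced data.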


Authors who are presenting talks have a * after their name.

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
