JSM 2014 Home

Abstract Details

Activity Number: 509
Type: Contributed
Date/Time: Wednesday, August 6, 2014 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Learning and Data Mining
Abstract #312032
Title: The Art of Balancing
Author(s): Zhen Zhang*+ and Justin Croft and Kendell Churchwell
Companies: C Spire Wireless and C Spire Wireless and C Spire Wireless
Keywords: skewed ; algorithm ; modeling ; classifier ; class imbalance ; target accuracy
Abstract:

Data mining is widely used in strategic marketing to uncover actionable information for a broad range of critical marketing decisions. A common problem in data mining applications is that the target is often skewed, and skewed data often leads to degenerate models that assign most or all cases to the most common outcome. For modeling projects with extremely skewed targets, such as churn prediction or fraud detection, balancing the data before modeling is a crucial step toward a useful model. Because modeling cases are sensitive to both domain and algorithm, there is no one-size-fits-all balancing strategy. In this paper we present empirical guidelines on balancing strategies for extremely skewed data with a binary outcome. Best practices are suggested for decision trees, logistic regression, and neural network models.
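
The abstract does not say which balancing techniques the authors evaluated. As an illustration only, the sketch below shows one common option, random undersampling of the majority class, on hypothetical toy data. The function name `undersample_majority`, the `ratio` parameter, and the 990-to-10 class split are assumptions made for this example and do not come from the paper.

```python
import random
from collections import Counter


def undersample_majority(rows, labels, ratio=1.0, seed=0):
    """Randomly drop majority-class rows so that the majority count is
    about `ratio` times the minority count. Minority rows are all kept.

    Illustrative helper (hypothetical name); not the authors' method.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected a binary outcome")
    (majority, n_major), (minority, n_minor) = counts.most_common(2)

    # Never ask for more majority rows than actually exist.
    keep_major = min(n_major, int(round(ratio * n_minor)))

    major_idx = [i for i, y in enumerate(labels) if y == majority]
    minor_idx = [i for i, y in enumerate(labels) if y == minority]

    # Sample majority rows without replacement, keep original row order.
    kept = sorted(rng.sample(major_idx, keep_major) + minor_idx)
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Hypothetical toy data: 990 non-churners (0) and 10 churners (1).
labels = [0] * 990 + [1] * 10
rows = [[i] for i in range(1000)]

# A degenerate model that always predicts the most common outcome
# already looks accurate on the skewed data.
baseline_accuracy = labels.count(0) / len(labels)

bal_rows, bal_labels = undersample_majority(rows, labels, ratio=1.0)
balanced_counts = Counter(bal_labels)
print(baseline_accuracy, balanced_counts)
```

With `ratio=1.0` the two classes end up the same size. A larger `ratio` keeps more of the majority class, which is one of the knobs a balancing strategy might tune per algorithm.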


Authors who are presenting talks have a * after their name.

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

Copyright © American Statistical Association.