JSM Preliminary Online Program
This is the preliminary program for the 2006 Joint Statistical Meetings in Seattle, Washington.

The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.


Activity Number: 243
Type: Contributed
Date/Time: Tuesday, August 8, 2006 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Computing
Abstract - #306470
Title: An Investigation of Missing Data Methods for Decision Trees
Author(s): Yufeng Ding*+ and Jeffrey Simonoff
Companies: New York University and New York University
Address: 44 W. 4th Street, New York, NY, 10012
Keywords: missing data ; decision tree ; probabilistic split ; surrogate split ; CART ; C4.5
Abstract:

Decision tree algorithms use many different missing data methods, but few studies have examined their appropriateness and performance. This paper provides both analytic and Monte Carlo evidence regarding the effectiveness of six popular missing data methods. We show that, in the context of decision trees, the relationship between the missingness and the dependent variable is the most helpful criterion for distinguishing among missing data methods, more so than Rubin's standard classification of missingness (missing completely at random, missing at random, and not missing at random). We also make recommendations as to the best method to use in various situations.


  • The address information is for the authors who have a + after their name.
  • Authors who are presenting talks have a * after their name.

Revised April, 2006