Abstract Details

Activity Number: 605
Type: Contributed
Date/Time: Wednesday, August 3, 2016 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Education
Abstract #319122
Title: Propensity Score Matching Using Random Forest in Educational Data Mining Problems
Author(s): Richard Levine*
Companies: San Diego State University
Keywords: observational study ; proximity ; learning analytics ; student success study

To draw unbiased inferences under the observational and quasi-experimental study designs common in educational research, matching methods are often applied to produce treatment and control groups that are balanced on all background variables. Although propensity scores have been a key component of such applications, propensity-score-based matching methods are limited by model misspecification, categorical variables with more than two levels, missing data, and nonlinear relationships. Random forest, which averages outcomes from many decision trees, is nonparametric, straightforward to use, and capable of handling these issues. More importantly, the precision afforded by random forest may provide a more accurate and less model-dependent estimate of the propensity score. The proximity matrix, a by-product of the random forest, is also shown to be a natural distance measure between observations that may be used in matching. The proposed random-forest-based matching methods are illustrated on a student success study evaluating the efficacy of a supplemental instruction section in a large-enrollment, bottleneck introductory statistics course.
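
The abstract does not spell out how the proximity matrix is turned into matches, so the following is only a minimal sketch under stated assumptions, not the author's procedure. It assumes the usual definition of random forest proximity (the share of trees in which two observations fall in the same terminal node) and, as an illustrative choice, greedy 1:1 matching without replacement. The leaf assignments, the treatment indicator, and the function names `proximity_matrix` and `match_by_proximity` are all hypothetical; in practice the leaf indices might come from a fitted forest, for example scikit-learn's `RandomForestClassifier.apply`.

```python
from typing import Dict, List, Sequence


def proximity_matrix(leaves: Sequence[Sequence[int]]) -> List[List[float]]:
    """Proximity = share of trees in which two observations share a leaf.

    leaves[i][t] is the terminal-node index of observation i in tree t.
    """
    n = len(leaves)
    n_trees = len(leaves[0])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
            prox[i][j] = shared / n_trees
    return prox


def match_by_proximity(
    prox: Sequence[Sequence[float]], treated: Sequence[int]
) -> Dict[int, int]:
    """Greedy 1:1 matching without replacement (an assumed, simple rule):
    each treated unit takes the unused control with the highest proximity."""
    treated_idx = [i for i, t in enumerate(treated) if t == 1]
    control_idx = [i for i, t in enumerate(treated) if t == 0]
    used = set()
    pairs: Dict[int, int] = {}
    for i in treated_idx:
        candidates = [j for j in control_idx if j not in used]
        if not candidates:
            break  # no controls left to match
        best = max(candidates, key=lambda j: prox[i][j])
        pairs[i] = best
        used.add(best)
    return pairs


# Hypothetical toy data: 5 students, leaf indices from 4 trees (made up).
toy_leaves = [
    [1, 2, 1, 3],  # student 0 (treated)
    [1, 2, 2, 3],  # student 1 (control)
    [4, 5, 6, 7],  # student 2 (control)
    [4, 5, 6, 3],  # student 3 (treated)
    [1, 5, 1, 7],  # student 4 (control)
]
toy_treated = [1, 0, 0, 1, 0]

prox = proximity_matrix(toy_leaves)
pairs = match_by_proximity(prox, toy_treated)
print(pairs)  # maps each treated student to its matched control
```

Greedy matching depends on the order in which treated units are processed; an optimal or caliper-based matching rule could be substituted without changing how the proximities are computed.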

Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program