Conference Program


All Times ET

Thursday, June 9
Machine Learning
Data Mining and Deep Learning
Thu, Jun 9, 10:30 AM - 12:00 PM

Random Forest to Estimate a Dose-Response Relationship in Quasi-Experimental Student Success Studies (310070)

Juanjuan Fan, San Diego State University 
*Richard A Levine, San Diego State University 
Justin Thorp, San Diego State University 

Keywords: causal inference, generalized propensity score, educational data mining

We develop a random forest of interaction trees method for ordered and continuous treatments to identify an individualized optimal treatment regime. Our motivation is evaluating the impact of a STEM tutoring center and a Supplemental Instruction program. Program advisors are particularly interested in estimating individualized treatment effects so they can advise students on when and how often to attend either program to maximize success in specific STEM courses. The approach introduces novel tree-growing procedures that incorporate generalized propensity scores. In this way, we can estimate average and individualized treatment effects over treatment dose in any form (binary treat/no treat, categorical, ordered, or continuous). We illustrate the efficacy of the method in an extensive simulation study, where it outperforms the current state of the art. We also show how the random forest can draw causal inferences in a study of student success in a large-enrollment introductory chemistry course. An R package integrating C++ for fast implementation is under development.
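
The abstract does not describe the tree-growing procedure in enough detail to reproduce it, so the following is not the authors' method. As background on the generalized propensity score (GPS) idea it builds on, here is a minimal sketch of a simpler, parametric regression-based dose-response estimate in the style of Hirano and Imbens, run on simulated data. Everything in it is an assumption made for illustration: the data-generating model, the normal linear treatment model, the quadratic outcome model, and the helper names `ols`, `gps`, `outcome_design`, and `dose_response`.

```python
import numpy as np

# Simulated data (hypothetical): x confounds both the dose t and the outcome y.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)                 # baseline covariate
t = x + rng.normal(size=n)             # continuous treatment dose
y = 2.0 * t + x + rng.normal(size=n)   # outcome; true dose-response is 2 * t


def ols(design, target):
    """Least-squares coefficients for target ~ design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


# Step 1: model the dose given covariates (assumed normal and linear).
cov_design = np.column_stack([np.ones(n), x])
gamma = ols(cov_design, t)
resid = t - cov_design @ gamma
sigma = resid.std(ddof=cov_design.shape[1])


def gps(dose):
    """Estimated conditional density of `dose` given each unit's covariates."""
    z = (dose - cov_design @ gamma) / sigma
    return np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))


def outcome_design(dose, score):
    """Quadratic outcome model in dose and GPS, with an interaction term."""
    return np.column_stack(
        [np.ones_like(score), dose, dose**2, score, score**2, dose * score]
    )


# Step 2: regress the outcome on the observed dose and the GPS at that dose.
beta = ols(outcome_design(t, gps(t)), y)


def dose_response(grid):
    """Step 3: average predicted outcome with every unit set to each dose."""
    curve = []
    for t0 in grid:
        dose = np.full(n, t0)
        curve.append((outcome_design(dose, gps(dose)) @ beta).mean())
    return np.array(curve)


grid = np.array([-1.0, 0.0, 1.0])
curve = dose_response(grid)
print(dict(zip(grid.tolist(), curve.round(2).tolist())))
```

The key step is evaluating the GPS at the counterfactual dose for every unit before averaging, which is what turns a regression fit into an estimated dose-response curve. A tree-based approach such as the one in the abstract would replace the fixed quadratic outcome model with data-driven splits.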