All Times EDT

Abstract Details

Activity Number: 165 - SLDS CSpeed 2
Type: Contributed
Date/Time: Tuesday, August 10, 2021 : 10:00 AM to 11:50 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #318880
Title: Adapting Random Forests for Use with Sample Survey Data
Author(s): Kevin Fenton* and Robert Petrin and Peter Szczesny
Companies: Ipsos Public Affairs and Ipsos Public Affairs and Ipsos Public Affairs
Keywords: Random Forest; Survey; Sample
Abstract:

Random forest (RF) and other tree-based methods have become an integral part of social and behavioral research. However, there is currently limited guidance on their use with data collected under complex sampling designs. As a result, applications of RF to sample survey data (including, for example, imputation, behavioral prediction, factor comparisons, and cross-method model validation) may be biased. The purpose of this paper is to evaluate methods for overcoming this limitation and improving the performance of RF, which we refer to as knotted branches, weighted bags, and post-hoc adjustments. Performance will be evaluated using a large, highly realistic synthetic population covering health risks and related behaviors in the United States.


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program