Abstract:
|
In recent years, survey practitioners have adopted machine learning methods such as classification and regression tree algorithms, which do not require specifying a functional form for the relationship between the outcome and the covariates and which select covariates through adaptive rules. Response propensity modeling is a particularly common application of these methods. Despite their usefulness and popularity, most classification algorithms in use today ignore the sampling design in their fitting criteria. We propose an extension of the popular Chi-square Automatic Interaction Detector (CHAID) approach that accounts for the design by applying a Rao-Scott correction in its splitting criterion. We discuss the statistical properties of the resulting algorithm and compare it with existing algorithms, including rpms, in the context of nonresponse adjustment for several surveys.
|