Abstract:
|
Growing reluctance to participate in surveys has led to low response rates (as low as 9% in telephone surveys), raising the potential for nonresponse bias. When auxiliary information is available on both respondents and nonrespondents, one common method for correcting nonresponse bias is to model response propensity, typically using a logit or probit model with a linear predictor. Recent developments in machine learning allow for flexible estimation of functional form and for variable selection. We examine whether current machine learning techniques can help reduce nonresponse bias in surveys, especially when response is a more complicated function of the covariates than is typically modeled. We apply these techniques to the German panel study "Labour Market and Social Security". We compare traditional techniques (e.g., raking, post-stratification, logistic regression) with machine learning techniques (e.g., classification trees, random forests, neural networks, adaptive LASSO with a polynomial expansion of regressors).
|