Abstract:
|
Informative selection, in which the distribution of response variables given that they are sampled is different from their distribution in the population, is pervasive in complex surveys. Failing to take such informativeness into account can produce severe inferential errors, including biased and inconsistent estimation of population parameters. While several parametric procedures exist to test for informative selection, these methods are limited in scope and their parametric assumptions are difficult to assess. Motivated by a kernel-based learning method that compares distributions based on their maximum mean discrepancy, we develop a class of nonparametric tests for informative selection that compares weighted and unweighted distributions. The asymptotic distributions of the test statistics are established under the null hypothesis of noninformative selection. Simulation results show that our tests have power competitive with existing parametric tests in a correctly specified parametric setting, and better than those tests under model misspecification. Our approach adapts automatically to multidimensional settings. A recreational angling application illustrates the methodology.
|
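To make the idea of comparing weighted and unweighted distributions through a maximum mean discrepancy concrete, the following is a minimal sketch and not the authors' test statistic. It assumes a Gaussian kernel with a fixed bandwidth and a simple plug-in (V-statistic) estimate in which the weighted empirical distribution uses normalized survey weights and the unweighted one uses mass 1/n on the same sample points. The function names, the kernel choice, and the `bandwidth` parameter are all illustrative assumptions. No critical values or asymptotic calibration are attempted here.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def weighted_vs_unweighted_mmd2(points, weights, bandwidth=1.0):
    """Plug-in squared MMD between the weighted and unweighted empirical
    distributions of one sample (illustrative assumption, not the paper's
    exact statistic).

    points  : list of observations, each a sequence of floats
    weights : positive survey weights, one per observation
    """
    n = len(points)
    total = sum(weights)
    # Signed measure: normalized survey weight minus the uniform mass 1/n.
    diff = [w / total - 1.0 / n for w in weights]
    # Quadratic form diff' K diff, where K is the kernel Gram matrix.
    return sum(
        diff[i] * diff[j] * gaussian_kernel(points[i], points[j], bandwidth)
        for i in range(n)
        for j in range(n)
    )


if __name__ == "__main__":
    sample = [(0.0,), (1.0,), (2.0,)]
    print(weighted_vs_unweighted_mmd2(sample, [2.0, 2.0, 2.0]))
    print(weighted_vs_unweighted_mmd2(sample, [1.0, 2.0, 3.0]))
```

With equal weights the two empirical distributions coincide and the discrepancy is zero. Weights that vary with the response pull the weighted distribution away from the unweighted one and yield a positive value, which is the kind of departure a test for informative selection would look for. Because observations enter only through the kernel, the same sketch applies unchanged to multidimensional responses.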