Abstract:

Given values of a covariate X, suppose we observe values of a response Y from one of several nonparametric data-generating regimes: a mean response function mu_1 plus noise, another mean response function mu_2 plus noise, and so forth. Suppose that mu_1, mu_2, etc., are known but that we are not certain which one is generating the values of Y. Two questions arise. First, how can we infer the data-generating mechanism? Second, if we can choose the values of X, how should we do so? One possible approach is to space the values of X along a grid, then use the values of Y to nonparametrically estimate the data-generating mechanism, and finally compare the estimated data-generating regime to mu_1, mu_2, etc. In this work, however, we show that the data-generating mechanism can be inferred without nonparametric estimation, such that the risk of misclassification decays at an exponential rate with respect to the sample size. We discuss the implications for addressing an inverse problem such as ascertaining nanoparticle properties from scattering data.
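
To make the setting concrete, the following is a minimal sketch of one way to pick a regime directly from the data, without first estimating the mean function nonparametrically. It is an illustration under stated assumptions, not necessarily the procedure developed in this work: it assumes additive Gaussian noise with a common variance, so that choosing the candidate with the smallest residual sum of squares coincides with the maximum-likelihood choice. The function name `classify_regime`, the two candidate mean functions, the grid design, and the noise level are all hypothetical.

```python
import math
import random


def classify_regime(xs, ys, mean_functions):
    """Return the index of the candidate mean function that best fits the data.

    Assumption (not from the source): noise is additive Gaussian with a
    common variance, so the best fit is the smallest residual sum of squares.
    """
    def rss(mu):
        return sum((y - mu(x)) ** 2 for x, y in zip(xs, ys))

    return min(range(len(mean_functions)), key=lambda k: rss(mean_functions[k]))


# Hypothetical known candidates mu_1 and mu_2 (stored at indices 0 and 1).
candidates = [math.sin, math.cos]

# Hypothetical design: n covariate values on an evenly spaced grid over [0, 1].
n = 200
xs = [i / (n - 1) for i in range(n)]

# Simulate responses from the second candidate plus Gaussian noise.
rng = random.Random(0)
noise_sd = 0.5
ys = [candidates[1](x) + rng.gauss(0.0, noise_sd) for x in xs]

chosen = classify_regime(xs, ys, candidates)
print("selected regime index:", chosen)
```

The rule only evaluates the known functions at the observed covariate values, so no smoothing or bandwidth choice is involved. How the covariate values should be placed to make the candidates easiest to tell apart is the design question raised above; the evenly spaced grid here is only a placeholder.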
