Abstract:
|
We examine the problem of constructing confidence intervals for nonparametric regression---that is, confidence intervals for E[Y|X], the conditional mean of Y given observed features X. If the distribution of the data is unknown, we may prefer not to impose any assumptions on the true regression function, and would instead like to construct confidence intervals that are "distribution-free". Our results show that, if the distribution of the feature vector X is nonatomic (that is, has no point masses), then there is a fundamental lower bound on the length of any distribution-free confidence interval---in particular, this length does not vanish as the sample size grows. On the other hand, if the feature vector X has a discrete distribution, then meaningful inference may still be possible, even if we rarely observe a repeated value.
|