Abstract:
|
Let $Z$ be a random $d$-vector that has a Lebesgue density and satisfies $E Z = 0$ and $E Z Z' = I_d$. Consider two projections defined by unit vectors $\alpha$ and $\beta$, namely a response $y = \alpha' Z$ and an explanatory variable $x = \beta' Z$. Under regularity conditions, Leeb (2013, AoS) has shown that, for most $\beta$'s, $E[y|x]$ is approximately linear in $x$ and $Var[y|x]$ is approximately constant in $x$, provided that $d$ is large. These results imply that most simple submodels of a high-dimensional linear model are approximately correct. However, Leeb's results are asymptotic as $d\to\infty$, and no explicit bounds have been established. We provide explicit, finite-$d$ error bounds for the results regarding the conditional expectation. For fixed $d$, let $E_d$ be the set of $\beta$'s such that $E[y|x]$ is approximately linear in $x$. We bound the size of $E_d$ (with respect to the uniform distribution $\upsilon$ on the unit $d$-sphere) and show that $\upsilon(E_d)\to 1$ as $d\to\infty$ at a rate faster than any polynomial rate. Our current research suggests that similar results can be expected for the conditional variances.
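As a point of reference for the two approximations (a standard calculation added for illustration, not one of the bounds discussed here): because $E Z = 0$ and $E Z Z' = I_d$, we have $Var[x] = \beta'\beta = 1$ and $Cov[y,x] = \alpha'\beta$, so the best linear predictor of $y$ given $x$ is $(\alpha'\beta)\,x$. Under the additional assumption, not made above, that $Z$ is exactly $N(0, I_d)$, the pair $(y,x)$ is jointly Gaussian and both approximations hold with equality for every $\beta$:
$$E[y|x] = (\alpha'\beta)\,x, \qquad Var[y|x] = 1 - (\alpha'\beta)^2.$$
For non-Gaussian $Z$ these identities can fail for particular $\beta$'s; the results described here concern how nearly the first one holds for most $\beta$'s at a given finite $d$.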
|