Abstract:
|
Recently, model-X methods such as model-X knockoffs and the conditional randomization test have been shown to provide an alternative paradigm for principled high-dimensional inference by shifting the burden of knowledge from modeling the response (Y) given the covariates (X) to modeling just the covariates. Model-X methods provide exact inferential guarantees for any choice of test statistic, including statistics derived from arbitrarily sophisticated machine learning algorithms, suggesting the potential for significant power gains in complex domains. We present a detailed theoretical analysis of the power of model-X methods, comparing it to the optimal achievable power and to the power of canonical methods that condition on X, and carrying out this comparison in the low-dimensional, medium-dimensional (n/p bounded away from zero and infinity), and high-dimensional regimes. One operational output of our theory is guidance on the most powerful choice of test statistic for model-X methods.
|