Abstract:
|
Model-X knockoffs are a new tool for controlling the false discovery rate in very complex statistical models; in fact, the models may be so complex that they can be treated as black boxes. To leverage the full power of this framework, however, we need flexible tools to construct knockoffs from sampled data. This talk presents a practical knockoff sampling machine that uses deep generative models to produce knockoffs for arbitrary and unspecified data distributions. The main idea is to iteratively refine a knockoff sampling mechanism until a criterion measuring the validity of the knockoffs it produces is minimized. This criterion measures a distance to pairwise exchangeability and is inspired by the maximum mean discrepancy, a popular measure in machine learning. Numerical experiments and quantitative tests indicate that our approach yields reasonable knockoffs. The resulting approach is model-free, and we present an application to the study of mutations linked to changes in drug resistance in the human immunodeficiency virus.
|
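To make the criterion described in the abstract more concrete, the following is a minimal sketch of how a distance to pairwise exchangeability could be estimated with a maximum mean discrepancy. It is an illustration under stated assumptions, not the procedure used in the talk: the Gaussian kernel, the fixed bandwidth, the biased estimator, the split of the sample into two halves, and the function names (`gaussian_kernel`, `mmd2`, `swap_columns`, `exchangeability_score`) are all choices made here for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def mmd2(a, b, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (
        gaussian_kernel(a, a, bandwidth).mean()
        + gaussian_kernel(b, b, bandwidth).mean()
        - 2.0 * gaussian_kernel(a, b, bandwidth).mean()
    )


def swap_columns(x, x_knockoff, indices):
    """Return [X, X~] with the listed coordinates exchanged between X and X~."""
    x_swapped, xk_swapped = x.copy(), x_knockoff.copy()
    x_swapped[:, indices] = x_knockoff[:, indices]
    xk_swapped[:, indices] = x[:, indices]
    return np.hstack([x_swapped, xk_swapped])


def exchangeability_score(x, x_knockoff, indices, bandwidth=1.0):
    """Squared MMD between [X, X~] and its swapped version.

    The two samples are built from disjoint halves of the data (an assumption
    of this sketch) so that they are independent of each other.
    """
    n = x.shape[0] // 2
    original = np.hstack([x[:n], x_knockoff[:n]])
    swapped = swap_columns(x[n:2 * n], x_knockoff[n:2 * n], indices)
    return mmd2(original, swapped, bandwidth)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(1000, 2))             # independent standard normal features
    valid = rng.normal(size=(1000, 2))         # an independent copy is a valid knockoff here
    invalid = 3.0 * rng.normal(size=(1000, 2)) # wrong scale, so swapping is detectable
    print("valid:  ", exchangeability_score(x, valid, [0], bandwidth=3.0))
    print("invalid:", exchangeability_score(x, invalid, [0], bandwidth=3.0))
```

A score near zero suggests that swapping the chosen coordinates leaves the joint distribution of the features and knockoffs unchanged, which is what pairwise exchangeability requires. A knockoff generator could in principle be refined by lowering such a score over many choices of swapped coordinates.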