Abstract:
|
In genetical genomics studies, it is desirable to use genetic variants as instruments to estimate the causal effects of gene expressions on a clinical trait. The high dimensionality and pleiotropic effects of genetic variants pose new challenges in choosing relevant and valid instruments. Existing methods for dealing with high-dimensional instruments either require unrealistic assumptions about the validity of instruments or can handle only a single exposure variable. To address these issues, we consider an instrumental variable model with high-dimensional, possibly invalid instruments and multiple exposures, present novel identifiability conditions, and propose a two-stage regularization method for estimating the causal effects. We further develop inferential procedures for obtaining debiased estimates and constructing confidence intervals. The proposed method can be implemented efficiently, and we provide theoretical guarantees. We demonstrate the usefulness of our method on simulated and real data.
|