Abstract:
|
Studies have shown that gene expression and genetic variants are closely associated with common human diseases. However, the relations between gene expression and phenotypes are affected by potential unmeasured confounders from multiple sources. Because the number of variables can exceed the sample size, a high-dimensional sparse instrumental variables model is used to explore such associations. A natural problem for this model is multiple hypothesis testing with control of the false discovery rate (FDR). We develop a multiple testing procedure for the sparse instrumental variables model, which extends the standard multiple testing procedure for the high-dimensional linear model. A test statistic for each coefficient is constructed based on inverse regressions, and a threshold is calculated that accounts for the correlation among these statistics. Theoretical results indicate that the proposed method controls the FDR. In addition, simulations are conducted to evaluate the performance of the proposed method. We also apply it to a yeast dataset to identify genes that are associated with the phenotype.
|