Abstract:
|
Genetic liability is known to be the single largest contributor to phenotypic variation. However, predicting complex disease risk from whole-genome data is methodologically challenging because detection power is limited for individual signal variants with small effect sizes. Existing methods typically explain only a small fraction of trait variance. In this paper, we propose a new dimension reduction method based on false negative control to facilitate powerful polygenic prediction. Using high-quality, out-of-sample GWAS summary statistics, we effectively remove variants that are not functionally relevant and transfer this knowledge to enable more efficient joint modeling on the target data. Although the target data, which contain individual-level genotype and phenotype measures, may have a limited sample size, the proposed method combines them with the out-of-sample summary statistics and can deliver more powerful and accurate polygenic prediction.
|