Abstract:
|
Accurate genetic prediction of complex traits requires polygenic methods that model all SNPs jointly. Previous polygenic methods make parametric assumptions about the SNP effect size distribution. However, because performance depends on how well the assumed effect size distribution matches the unknown truth, different polygenic methods perform well for different traits. To enable robust phenotype prediction across a range of traits, we develop a novel polygenic model with a flexible assumption about the effect size distribution. We refer to our model as the latent Dirichlet Process Regression (DPR). DPR uses the Dirichlet process to place a prior on the effect size distribution itself, is non-parametric in nature, and is capable of inferring the effect size distribution from the data at hand. Because of this flexible modeling assumption, DPR is able to adapt to a broad spectrum of genetic architectures and achieves robust predictive performance for a variety of complex traits. We illustrate the benefits of DPR by applying it to predict gene expression using cis-SNPs, to conduct PrediXcan-based gene set tests, and to perform genomic selection for five traits across three species.
|