Abstract:
|
Expression quantitative trait locus (eQTL) detection is the task of identifying associations between gene expression levels and DNA variation. Raw read counts from RNA-Seq expression measurements are highly non-normal, so the rank-based inverse normal transformation is commonly applied to ensure accurate p-values. However, this transformation yields estimates of allelic effect sizes that are biologically uninterpretable. Log transformations provide interpretable estimates, but linear regression on log-transformed expression conflicts with a model in which each allele contributes independently to mean expression. In this paper, we introduce a non-linear model for expression as a function of SNP genotype that respects this biological model and provides both accurate p-values and interpretable effect size estimates. We propose a fast iterative algorithm for fitting the model and apply it to a massive data set provided by the Genotype-Tissue Expression (GTEx) consortium.
|
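The conflict between log-linear regression and independent allele-specific contributions can be illustrated with a small numerical sketch. It assumes, purely for illustration, that mean expression is the sum of one contribution per allele copy; the function names and the contribution values are hypothetical and are not taken from the paper. Under that assumption mean expression is exactly linear in genotype, so its logarithm cannot be.

```python
import math

def mean_expression(genotype, ref_contrib, alt_contrib):
    """Mean expression under an assumed additive allele model.

    genotype is the number of alternate alleles (0, 1, or 2); each of the
    two allele copies contributes independently to the mean.
    """
    return (2 - genotype) * ref_contrib + genotype * alt_contrib

def second_difference(values):
    """Curvature across genotypes 0, 1, 2; zero means exactly linear."""
    return values[2] - 2 * values[1] + values[0]

# Hypothetical per-allele contributions, chosen only for illustration.
ref_contrib, alt_contrib = 10.0, 30.0

means = [mean_expression(g, ref_contrib, alt_contrib) for g in (0, 1, 2)]
log_means = [math.log(m) for m in means]

raw_curvature = second_difference(means)      # linear on the raw scale
log_curvature = second_difference(log_means)  # concave on the log scale

print("mean expression by genotype:", means)
print("curvature on raw scale:", raw_curvature)
print("curvature on log scale:", log_curvature)
```

Whenever the two allelic contributions differ, the heterozygote mean is the arithmetic average of the two homozygote means, so the log-scale curvature is strictly negative and a straight line in genotype cannot fit the log of the mean exactly.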