Abstract:
|
Evolve and re-sequence studies provide a popular approach to simulate evolution in the lab and explore its genetic basis. In this context, the chi-square test and the Cochran-Mantel-Haenszel test are commonly used to infer genomic positions affected by selection from temporal changes in allele frequency. However, the null model associated with these tests does not match the null hypothesis of actual interest. Indeed, due to genetic drift and other noise components, the null variance in the data can be substantially larger than these tests account for. This leads to a large number of false-positive results. Even if the ranking rather than the actual p-values is of interest, a naive application of these tests will give misleading results, as the amount of over-dispersion varies from locus to locus. We therefore propose easy-to-compute test statistics that adjust for the over-dispersion. Their ease of computation is particularly useful in genome-wide applications involving millions of SNPs. If estimates of the null variance are available, the resulting formulas may be useful in other applications too.
|