Abstract:
Identifying population structure from multilocus genotype data is key to downstream population genetic analyses in a variety of fields, including conservation, evolutionary genetics, genome-wide association studies (GWAS), and pedigree reconstruction for quantitative genetics. Both Bayesian and maximum likelihood approaches exist for fitting the underlying population structure model, but neither has scaled well to large datasets. We extend recent improvements in accelerated optimization routines for independent binomial models to the multinomial case. We demonstrate striking speed improvements: the accelerated method finds the global maximum more quickly and makes feasible computationally intensive analyses, such as those used to estimate the number of clusters K. We also show that methods for estimating K are far more reliable when used with the accelerated maximum likelihood approach. Genetics is our motivating problem, but the model applies generally to mixtures of coordinate-wise independent multinomials.
Copyright © American Statistical Association.