Abstract:
|
Specification of appropriate evolutionary models is a crucial step in phylogenetic inference. Evolutionary model selection has traditionally been accomplished by first partitioning a multiple sequence alignment and then estimating the best-fitting evolutionary model for each partition. However, this approach can be difficult and time-consuming, and it has many limitations. Automatic model selection approaches can address these issues and yield better-fitting models by simultaneously estimating the number of partitions, the assignment of sites to partitions, and the substitution model for each partition. Recent progress has been made in this direction through the development of a Dirichlet process model, and we take this development a step further by employing a generalized Polya urn process. A generalized Polya urn process includes a large number of countable mixture models as special cases, and we examine the effectiveness of different mixture models in improving evolutionary model fit. We also develop Markov chain Monte Carlo (MCMC) algorithms that exploit data squashing techniques to reduce numerical bookkeeping and that can efficiently explore the large phylogenetic model space.
|