Abstract:
|
At the species level, phylogenetic trees are essential to understanding the evolutionary process and the relationships among species, and thus to interpreting the biological information in a collection of DNA sequence data. Branch lengths, or speciation times, are typical attributes of a species tree and can help to date the formation of species. Here we derive an estimator for branch lengths under the multi-species coalescent model and the Jukes-Cantor model, assuming a molecular clock. Our proposed estimator is distinct from existing estimators in that it uses pseudolikelihood to achieve computational efficiency within a model-based framework. The method incorporates all of the variability in the data without losing computational efficiency. We show that this estimator is statistically consistent and asymptotically normally distributed. We find that the nonparametric bootstrap provides a more accurate estimate of the variance of the estimates than the theoretical asymptotic variance. The performance and computational cost of our method of speciation time estimation are assessed using simulated datasets and a genome-scale dataset for gibbons.
|