Abstract:
|
We propose a new method for inferring phylogenetic networks more efficiently. Phylogenetic networks describe the evolutionary history of species, languages, or individuals. They model vertical inheritance while also allowing for horizontal transfer of information through species hybridization, horizontal gene transfer, or word sharing between languages. Phylogenetic networks differ from other network problems in that data are available only at the leaves, with no information about interior nodes. To infer network structure from tip data alone, we optimize the network’s likelihood under a Markov substitution model. Under current coalescent models, likelihood calculations take exponential time and are therefore prohibitively expensive. Likelihood-based network inference nevertheless has substantial benefits: fewer non-identifiability issues, more precise inference, and more accurate comparisons between networks. To improve efficiency, we simplify the coalescent model while retaining key elements known to be important in real data, and we implement a fast divide-and-conquer algorithm. We describe the improvement over several existing methods.
|