
Browsing by Author "Cao, Zhen"

Now showing 1 - 4 of 4
  • Maximum Parsimony Inference of Phylogenetic Networks in the Presence of Polyploid Complexes
    (Oxford University Press, 2022) Yan, Zhi; Cao, Zhen; Liu, Yushu; Ogilvie, Huw A.; Nakhleh, Luay
    Phylogenetic networks provide a powerful framework for modeling and analyzing reticulate evolutionary histories. While polyploidy has been shown to be prevalent not only in plants but also in other groups of eukaryotic species, most work done thus far on phylogenetic network inference assumes diploid hybridization. These inference methods have been applied, with varying degrees of success, to data sets with polyploid species, even though polyploidy violates the mathematical assumptions underlying these methods. Statistical methods were developed recently for handling specific types of polyploids, as were parsimony methods that handle polyploidy more generally but exclude processes such as incomplete lineage sorting. In this article, we introduce a new method for inferring most parsimonious phylogenetic networks on data that include polyploid species. Taking gene tree topologies as input, the method seeks a phylogenetic network that minimizes deep coalescences while accounting for polyploidy. We demonstrate the performance of the method on both simulated and biological data. The inference method, as well as a method for evaluating evolutionary hypotheses in the form of phylogenetic networks, is implemented and publicly available in the PhyloNet software package. [Incomplete lineage sorting; minimizing deep coalescences; multilabeled trees; multispecies network coalescent; phylogenetic networks; polyploidy.]
  • Phylogenomic assessment of the role of hybridization and introgression in trait evolution
    (Public Library of Science, 2021) Wang, Yaxuan; Cao, Zhen; Ogilvie, Huw A.; Nakhleh, Luay K.
    Trait evolution among a set of species, a central theme in evolutionary biology, has long been understood and analyzed with respect to a species tree. However, the field of phylogenomics, which has been propelled by advances in sequencing technologies, has ushered in the era of species/gene tree incongruence and, consequently, a more nuanced understanding of trait evolution. For a trait whose states are incongruent with the branching patterns in the species tree, the same state could have arisen independently in different species (homoplasy) or followed the branching patterns of gene trees that are incongruent with the species tree (hemiplasy). Another evolutionary process whose extent and significance are better revealed by phylogenomic studies is gene flow between different species. In this work, we present a phylogenomic method for assessing the role of hybridization and introgression in the evolution of polymorphic or monomorphic binary traits. We apply the method to simulated evolutionary scenarios to demonstrate the interplay between the parameters of the evolutionary history and the role of introgression in a binary trait’s evolution (which we call xenoplasy). Importantly, we demonstrate, including on a biological data set, that inferring a species tree and using it for trait evolution analysis in the presence of gene flow could lead to misleading hypotheses about trait evolution.
  • Polyphest: fast polyploid phylogeny estimation
    (Oxford University Press, 2024) Yan, Zhi; Cao, Zhen; Nakhleh, Luay
    Despite the widespread occurrence of polyploids across the Tree of Life, especially in the plant kingdom, very few computational methods have been developed to handle the specific complexities introduced by polyploids in phylogeny estimation. Furthermore, methods that are designed to account for polyploidy often disregard incomplete lineage sorting (ILS), a major source of heterogeneous gene histories, or are computationally very demanding. Therefore, there is a great need for efficient and robust methods to accurately reconstruct polyploid phylogenies. We introduce Polyphest (POLYploid PHylogeny ESTimation), a new method for efficiently and accurately inferring species phylogenies in the presence of both polyploidy and ILS. Polyphest bypasses the need for extensive network space searches by first generating a multilabeled tree based on gene trees, which is then converted into a (uniquely labeled) species phylogeny. We compare the performance of Polyphest to that of two polyploid phylogeny estimation methods, one of which does not account for ILS, namely PADRE, and another that accounts for ILS, namely MPAllopp. Polyphest is more accurate than PADRE and achieves comparable accuracy to MPAllopp, while being significantly faster. We also demonstrate the application of Polyphest to empirical data from the hexaploid bread wheat and confirm the allopolyploid origin of bread wheat along with the closest relatives for each of its subgenomes. Polyphest is available at https://github.com/NakhlehLab/Polyphest.
  • Towards more accurate phylogenetic network inference
    (2023-04-21) Cao, Zhen; Nakhleh, Luay; Ogilvie, Huw Alexander
    The multispecies network coalescent (MSNC) extends the multispecies coalescent (MSC) by modeling gene evolution within the branches of a phylogenetic network rather than a phylogenetic tree. The network, a rooted, directed, acyclic graph, captures both speciation events and reticulate evolutionary events. Existing methods for phylogenetic network inference were developed to account for reticulation and incomplete lineage sorting (ILS) simultaneously. While these methods demonstrate good accuracy on the inference of network topologies and continuous parameters in simple simulation settings, their accuracy can be easily degraded in more complex scenarios, such as model violations arising from gene tree estimation error and substitution rate heterogeneity. Moreover, reconstructing subnetworks obtained by dividing a full network might be even more difficult than reconstructing the network on the full set of taxa. The contributions of this thesis are as follows. First, I explore limiting the search space of networks by inferring a phylogenetic tree and then adding horizontal edges to it. I evaluate this tree-to-network augmentation phase under the minimizing deep coalescence and pseudo-likelihood criteria. I show that a recently developed divide-and-conquer approach significantly outperforms tree-based inference in terms of accuracy, albeit still at a higher computational cost. Second, I study statistical tests for assessing the fit of gene trees to the MSC with realistic gene tree error profiles, and develop a novel approach to determining model complexity in the presence of gene tree estimation error. Third, I extend the Bayesian inference method MCMC_SEQ to address the model misspecification caused by rate heterogeneity across loci, which leads to spurious reticulations. I also study the effects of this model misspecification, including for the summary method Infernetwork_ML, using simulations and an empirical dataset from Heliconius butterflies. Fourth, because the scalable divide-and-conquer approach is promising but depends on accurate and efficient subnetwork inference, and because dividing a network into subnetworks can make inference harder, I explore how to infer complex subnetworks accurately and improve the efficiency of MCMC_SEQ. I implement all the approaches in the publicly available open-source software package PhyloNet.
