Methods for Elucidating and Utilizing Local Phylogenies in Phylogenomics

Date
2019-04-18
Abstract

Understanding the evolutionary history of life on Earth reveals the origins of all life and gives insight into the underpinnings of human disease. While we continue to gather the DNA sequence data necessary to infer past evolutionary histories, the signal in this genomic data can be difficult to exploit fully. From a mathematical modeling perspective, the interplay of complex processes such as recombination, incomplete lineage sorting (ILS), and gene flow quickly complicates the generative process by which DNA sequences arise through evolution. Computationally, the rapid growth in the volume of sequence data necessitates efficient algorithms for inferring past evolutionary histories. In particular, this thesis addresses the added challenges introduced by recombination, which breaks genomes up into localized regions whose evolutionary histories can disagree with one another. These localized evolutionary histories, known as local genealogies or local phylogenies, are interspersed throughout the genome in between regions affected by past recombination events.

Local phylogenies can be difficult to infer in their own right. The signal for where recombination occurred, and for how the evolutionary histories of individual regions agree or disagree, can be subtle. This same signal, however, can be used to infer important past events such as gene flow between species, or even how genetic links to disease have evolved. In this thesis, I contribute to addressing these problems in the following ways. First, I introduce a new method for inferring local phylogenies at scale. This method leverages current state-of-the-art tree-building software to scan across a multiple sequence alignment and infer the localized evolutionary histories while simultaneously handling complications from recombination and weak signal. Second, I introduce a new method for detecting when localized evolutionary histories were affected by past gene flow. For this work, I extend the theoretical framework of the D-statistic, which is widely used to scan for localized regions whose evolution was affected by gene flow, to handle arbitrarily complex gene flow scenarios with any number of sequences. Finally, I have released the Automated Local Phylogenomic Analyses (ALPHA) toolkit, which provides open-source implementations of these methods as well as additional functionality useful to biologists.
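
For readers unfamiliar with the D-statistic mentioned above, the following is a minimal sketch of the classical four-taxon form only (the ABBA-BABA test), not the generalized statistic developed in the thesis and not code from the ALPHA toolkit. The function name `d_statistic` and its argument names are hypothetical, chosen for illustration. It assumes four equal-length aligned sequences with tree shape ((P1, P2), P3), outgroup, and treats the outgroup base as the ancestral allele.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Count ABBA and BABA sites and return (abba, baba, D).

    Illustrative sketch: D = (ABBA - BABA) / (ABBA + BABA).
    D is None when no informative sites are found.
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    valid = set("ACGT")
    abba = baba = 0
    for a, b, c, o in zip(p1.upper(), p2.upper(), p3.upper(), outgroup.upper()):
        # Skip gaps and ambiguous bases.
        if not {a, b, c, o} <= valid:
            continue
        # P3 must carry a derived allele (one that differs from the outgroup).
        if c == o:
            continue
        if a == o and b == c:
            abba += 1  # P1 ancestral, P2 and P3 share the derived allele
        elif b == o and a == c:
            baba += 1  # P2 ancestral, P1 and P3 share the derived allele

    total = abba + baba
    d = (abba - baba) / total if total else None
    return abba, baba, d


if __name__ == "__main__":
    # Toy alignment: three ABBA sites, one BABA site, one invariant site, one gap.
    print(d_statistic("ATAGA-", "TATGCA", "TTTGCA", "AAAGAA"))
```

Under incomplete lineage sorting alone, the two site patterns are expected to occur equally often, so D is expected to be near zero; a significant excess of one pattern is the usual indication of gene flow between P3 and either P1 or P2. In practice, significance is assessed with a resampling procedure such as a block jackknife, which this sketch omits.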

Degree
Doctor of Philosophy
Type
Thesis
Keywords
Recombination, Phylogenetics, Phylogenomics, Genealogies
Citation

Elworth, Ryan Anthony Leo. "Methods for Elucidating and Utilizing Local Phylogenies in Phylogenomics." (2019) Diss., Rice University. https://hdl.handle.net/1911/105999.

Rights
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.