Methods for Elucidating and Utilizing Local Phylogenies in Phylogenomics

dc.contributor.advisor: Nakhleh, Luay K [en_US]
dc.creator: Elworth, Ryan Anthony Leo [en_US]
dc.date.accessioned: 2019-05-17T19:05:41Z [en_US]
dc.date.available: 2019-05-17T19:05:41Z [en_US]
dc.date.created: 2019-05 [en_US]
dc.date.issued: 2019-04-18 [en_US]
dc.date.submitted: May 2019 [en_US]
dc.date.updated: 2019-05-17T19:05:42Z [en_US]
dc.description.abstract: Understanding the evolutionary history of life on Earth tells us about the origins of all life and gives insight into the underpinnings of human disease. While we continue to gather the DNA sequence data necessary to infer past evolutionary histories, the signal in these genomic data can be difficult to exploit fully. From a mathematical modeling perspective, the interplay of complex processes such as recombination, incomplete lineage sorting (ILS), and gene flow quickly complicates the generative process by which DNA sequences arise as a result of evolution. Computationally, the rapid growth in the volume of sequence data necessitates efficient algorithms for inferring past evolutionary histories. In particular, this thesis addresses the added challenges introduced by recombination, which breaks genomes up into localized regions whose evolutionary histories can disagree with one another. These localized evolutionary histories, known as local genealogies or local phylogenies, are interspersed throughout the genome in between regions affected by past recombination events. Local phylogenies can be difficult to infer in their own right: the signal for where recombination occurs and how the evolutionary histories of individual regions agree or disagree can be subtle. This same signal, however, can be used to infer important past events such as gene flow between species or even how genetic links to disease have evolved. In this thesis, I contribute to addressing these problems in the following ways. First, I introduce a new method for inferring local phylogenies at scale. This method leverages current state-of-the-art tree-building software to scan across a multiple sequence alignment and infer the localized evolutionary histories while simultaneously handling complications from recombination and low amounts of signal. Second, I introduce a new method for detecting when localized evolutionary histories were affected by past gene flow. For this work, I extend the theoretical framework of the D-statistic, used ubiquitously to scan for localized regions whose evolution was affected by gene flow, to handle arbitrarily complex gene flow scenarios with any number of sequences. Finally, I have disseminated the Automated Local Phylogenomic Analyses (ALPHA) toolkit with open source implementations of these methods as well as additional functionalities useful to biologists. [en_US]
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Elworth, Ryan Anthony Leo. "Methods for Elucidating and Utilizing Local Phylogenies in Phylogenomics." (2019) Diss., Rice University. https://hdl.handle.net/1911/105999. [en_US]
dc.identifier.uri: https://hdl.handle.net/1911/105999 [en_US]
dc.language.iso: eng [en_US]
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. [en_US]
dc.subject: Recombination [en_US]
dc.subject: Phylogenetics [en_US]
dc.subject: Phylogenomics [en_US]
dc.subject: Genealogies [en_US]
dc.title: Methods for Elucidating and Utilizing Local Phylogenies in Phylogenomics [en_US]
dc.type: Thesis [en_US]
dc.type.material: Text [en_US]
thesis.degree.department: Computer Science [en_US]
thesis.degree.discipline: Engineering [en_US]
thesis.degree.grantor: Rice University [en_US]
thesis.degree.level: Doctoral [en_US]
thesis.degree.name: Doctor of Philosophy [en_US]
Files
Original bundle
Name: ELWORTH-DOCUMENT-2019.pdf
Size: 3.54 MB
Format: Adobe Portable Document Format