Towards Accurate Reconstruction of Phylogenetic Networks

dc.contributor.advisor: Nakhleh, Luay K. (en_US)
dc.contributor.committeeMember: Kohn, Michael H. (en_US)
dc.contributor.committeeMember: Jermaine, Christopher M. (en_US)
dc.creator: Park, HyunJung (en_US)
dc.date.accessioned: 2012-09-06T04:43:46Z (en_US)
dc.date.accessioned: 2012-09-06T04:43:52Z (en_US)
dc.date.available: 2012-09-06T04:43:46Z (en_US)
dc.date.available: 2012-09-06T04:43:52Z (en_US)
dc.date.created: 2012-05 (en_US)
dc.date.issued: 2012-09-05 (en_US)
dc.date.submitted: May 2012 (en_US)
dc.date.updated: 2012-09-06T04:43:52Z (en_US)
dc.description.abstract: Since Darwin proposed that all species on Earth evolved from a common ancestor, evolution has played an important role in understanding biology. While the evolutionary relationships and histories of genes are represented using trees, the evolutionary history of a genome may not be adequately captured by a tree, because some evolutionary events, such as horizontal gene transfer (HGT), do not fit within the branches of a tree. In such cases, phylogenetic networks are more appropriate for modeling evolutionary histories. In this dissertation, we present computational algorithms to reconstruct phylogenetic networks from different types of data. Under the assumption that species have single copies of genes, and that HGT and speciation are the only events in the course of evolution, gene sequences can be sampled one copy per species for HGT detection. Given alignments of the sequences, we propose systematic methods that estimate the significance of detected HGT events under maximum parsimony (MP) and maximum likelihood (ML). The estimated significance aims to address the overestimation by both optimization criteria in the search for phylogenetic networks and helps the search identify networks with the "right" number of HGT edges. We study the performance of these methods on both synthetic and biological data sets. While the studies show very promising results in identifying HGT edges, they also highlight the issues that are challenging for each criterion. We also develop algorithms that estimate the number of HGT events and reconstruct phylogenetic networks from a collection of trees by using the pairwise Subtree-Prune-Regraft (SPR) operation. In general, the methods perform well at quickly estimating the minimum number of HGT events required to reconcile a set of trees. Further, we identify conditions under which the methods do not work well, in order to help the development of new methods in this area.
Finally, we relax the assumptions about the genetic evolutionary process to allow for duplication and loss. Under this assumption, we analyze gene family trees of proteobacterial strains using a parsimony-based approach to detect evolutionary events. We also discuss current issues with parsimony-based approaches in biological data analysis and propose a way to retrieve significant estimates. The evolutionary history of species is complex and involves various evolutionary events. Because HGT contributes largely to this complexity, accurately identifying HGT will help untangle evolutionary histories and answer important questions. As our algorithms identify significant HGT events in the data and reconstruct accurate phylogenetic networks from them, they can be used to address questions arising in large-scale biological data analyses. (en_US)
dc.format.mimetype: application/pdf (en_US)
dc.identifier.citation: Park, HyunJung. "Towards Accurate Reconstruction of Phylogenetic Networks." (2012) Diss., Rice University. https://hdl.handle.net/1911/64705. (en_US)
dc.identifier.slug: 123456789/ETD-2012-05-181 (en_US)
dc.identifier.uri: https://hdl.handle.net/1911/64705 (en_US)
dc.language.iso: eng (en_US)
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. (en_US)
dc.subject: Computational biology (en_US)
dc.subject: Bioinformatics (en_US)
dc.subject: Evolution (en_US)
dc.subject: Phylogenetic networks (en_US)
dc.subject: Reticulate evolutionary event (en_US)
dc.title: Towards Accurate Reconstruction of Phylogenetic Networks (en_US)
dc.type: Thesis (en_US)
dc.type.material: Text (en_US)
thesis.degree.department: Computer Science (en_US)
thesis.degree.discipline: Engineering (en_US)
thesis.degree.grantor: Rice University (en_US)
thesis.degree.level: Doctoral (en_US)
thesis.degree.name: Doctor of Philosophy (en_US)
Files
Original bundle
Name: PARK-THESIS.pdf
Size: 5.8 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.61 KB
Format: Item-specific license agreed upon to submission