A novel computational platform for sensitive, accurate, and efficient screening of nucleic acids

dc.contributor.advisor: Treangen, Todd
dc.creator: Albin, Dreycey Don
dc.date.accessioned: 2020-05-07T16:06:17Z
dc.date.available: 2020-05-07T16:06:17Z
dc.date.created: 2020-05
dc.date.issued: 2020-04-24
dc.date.submitted: May 2020
dc.date.updated: 2020-05-07T16:06:17Z
dc.description.abstract: Recent advances in synthetic biology and nucleic acid synthesis, coupled with increasing concerns about their intentional or accidental misuse, require more sophisticated screening tools to identify genes of interest within short sequence fragments. One major limitation in predicting DNA sequences of concern is the inadequacy of current computational tools and ontologies to describe the specific biological processes of pathogenic proteins. In the first part of this thesis, we design and implement a novel computational platform, SeqScreen, that sensitively assigns taxonomic classifications, functional annotations, and biological processes of interest to short nucleotide sequences of unknown origin (50 bp to 1,000 bp). The overarching goal is to perform sensitive characterization of short sequences and highlight specific pathogenic biological processes of interest (BPoIs). The SeqScreen software executes these tasks in analytical workflows and outputs results in a tab-delimited report. In the second part, we take a deep computational dive into taxonomic classification, focusing on biases caused by differences in the sequences that reference databases contain, which change radically over time and differ significantly from repository to repository. To mitigate these drawbacks, the Database Query Tool (DQT) is presented as an effective, easy-to-use method for investigating the taxonomic composition of databases commonly used in metagenomics. It outputs the databases and related versions that contain a given input NCBI taxonomic ID, allowing a user to decide which database to use for a given sample, and it can also serve as a method for post-analysis. In summary, we provide two novel computational tools for sensitive and accurate characterization of nucleic acid sequences.
dc.format.mimetype: application/pdf
dc.identifier.citation: Albin, Dreycey Don. "A novel computational platform for sensitive, accurate, and efficient screening of nucleic acids." (2020) Master’s Thesis, Rice University. https://hdl.handle.net/1911/108636.
dc.identifier.uri: https://hdl.handle.net/1911/108636
dc.language.iso: eng
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
dc.subject: Synthetic Biology
dc.subject: Bioinformatics
dc.subject: Metagenomics
dc.subject: Computational Biology
dc.title: A novel computational platform for sensitive, accurate, and efficient screening of nucleic acids
dc.type: Thesis
dc.type.material: Text
thesis.degree.department: Systems, Synthetic and Physical Biology
thesis.degree.discipline: Natural Sciences
thesis.degree.grantor: Rice University
thesis.degree.level: Masters
thesis.degree.name: Master of Science
Files
Original bundle
Name: ALBIN-DOCUMENT-2020.pdf
Size: 2.97 MB
Format: Adobe Portable Document Format
License bundle
Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text

Name: LICENSE.txt
Size: 2.61 KB
Format: Plain Text