Computational Methods for Analyses of Single-cell DNA Sequencing Data in Cancer

dc.contributor.advisor: Nakhleh, Luay
dc.creator: Edrisi, Mohammadamin
dc.date.accessioned: 2024-05-22T16:15:10Z
dc.date.available: 2024-05-22T16:15:10Z
dc.date.created: 2024-05
dc.date.issued: 2024-04-16
dc.date.submitted: May 2024
dc.date.updated: 2024-05-22T16:15:10Z
dc.description.abstract: The study of cancer using single-cell sequencing technology has opened new avenues for understanding the genomic complexity and heterogeneity of this disease. However, analyzing such data poses computational challenges, both in designing novel mathematical models for biological discovery and in devising methods that scale to the large single-cell sequencing datasets now emerging. During my Ph.D. studies, I worked on several research projects, each addressing such challenges in the analysis of single-cell sequencing data in the context of cancer. In this thesis, I present my contributions to three studies and their corresponding methods: Phylovar, for phylogeny-aware detection of single-nucleotide variations (SNVs); MoTERNN, for classifying the mode of cancer evolution; and MaCroDNA, for integrating high-throughput single-cell DNA and RNA sequencing data. In Phylovar, I improved the joint inference of cancer cells' SNVs (a common type of mutation in cancer) and their phylogeny, an approach known as phylogeny-aware SNV detection. Although this approach is highly accurate, its scalability to large single-cell sequencing datasets was limited. To address this, I introduced a novel vectorized formulation for computing the model's likelihood function. The resulting speedup scales accurate SNV detection from hundreds to millions of genomic loci, as required by the fast-expanding datasets from single-cell whole-genome and whole-exome sequencing technologies. MoTERNN aims to determine the mode of cancer evolution (linear, branching, neutral, or punctuated), each of which indicates a specific evolutionary pattern critical for diagnosis, prognosis, and treatment strategies.
I treated this as a graph classification problem, with phylogenetic trees as the graphs and evolution modes as the classes, and employed Recursive Neural Networks (RvNNs) for classification. As the first application of RvNNs to phylogenetics, MoTERNN achieved very high accuracy in both the training and testing phases, showcasing the potential of RvNNs for learning on phylogenetic trees. In the MaCroDNA project, I aimed to link DNA mutations to their effects at the RNA level by pairing cells that were sequenced for either DNA or RNA alone. I employed a maximum weighted bipartite matching algorithm to assign cells across the two data domains so that the sum of the Pearson correlation coefficients over all matched pairs is maximized. MaCroDNA achieved high accuracy and outperformed the state-of-the-art method by a large margin.
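The matching step that the abstract describes for MaCroDNA can be illustrated with a toy sketch. This is not the thesis implementation: the function names, the input values, and the handling of constant profiles are assumptions made for illustration, and the exhaustive search below is only practical for a handful of cells. A real dataset would call for a polynomial-time assignment solver such as `scipy.optimize.linear_sum_assignment` with `maximize=True`.

```python
from itertools import permutations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant profile as uncorrelated
    return cov / sqrt(var_x * var_y)


def match_cells(dna_profiles, rna_profiles):
    """Pair each DNA cell with a distinct RNA cell, maximizing total correlation.

    Assumes there are at least as many RNA cells as DNA cells.
    Exhaustive search over assignments: suitable for a toy example only.
    """
    # Edge weights of the bipartite graph: one correlation per DNA/RNA pair.
    weights = [[pearson(d, r) for r in rna_profiles] for d in dna_profiles]
    best_score, best_assignment = float("-inf"), None
    for perm in permutations(range(len(rna_profiles)), len(dna_profiles)):
        score = sum(weights[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_assignment = score, perm
    return list(enumerate(best_assignment)), best_score


# Hypothetical per-gene values for three DNA cells and three RNA cells.
dna = [[2, 2, 4, 4], [1, 3, 1, 3], [4, 3, 2, 1]]
rna = [[10, 30, 11, 29], [40, 31, 19, 10], [5, 6, 9, 10]]
pairs, total = match_cells(dna, rna)
print(pairs, round(total, 3))
```

Each tuple in `pairs` is `(dna_index, rna_index)`. Maximizing the sum over the whole matching, rather than picking the best partner for each cell independently, is what keeps two DNA cells from being assigned to the same RNA cell.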
dc.format.mimetype: application/pdf
dc.identifier.citation: Edrisi, Mohammadamin. Computational Methods for Analyses of Single-cell DNA Sequencing Data in Cancer. (2024). PhD diss., Rice University. https://hdl.handle.net/1911/116189
dc.identifier.uri: https://hdl.handle.net/1911/116189
dc.language.iso: eng
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
dc.subject: Single-cell DNA sequencing analysis
dc.subject: cancer evolutionary biology
dc.subject: Single-nucleotide variation detection
dc.subject: Single-cell RNA sequencing analysis
dc.subject: Single-cell multi-omics integration
dc.subject: Maximum-likelihood estimation
dc.subject: Recursive Neural Networks
dc.subject: Maximum weighted bipartite matching
dc.title: Computational Methods for Analyses of Single-cell DNA Sequencing Data in Cancer
dc.type: Thesis
dc.type.material: Text
thesis.degree.department: Computer Science
thesis.degree.discipline: Natural Sciences
thesis.degree.grantor: Rice University
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy
Files
Original bundle
Name: EDRISI-DOCUMENT-2024.pdf
Size: 78.13 MB
Format: Adobe Portable Document Format
License bundle
Name: PROQUEST_LICENSE.txt
Size: 5.85 KB
Format: Plain Text
Name: LICENSE.txt
Size: 2.98 KB
Format: Plain Text