Simultaneous SNV calling and Phylogenetic Inference for Single-cell Sequencing Data

Date
2020-11-06
Abstract

Single-cell sequencing provides a powerful approach for elucidating intratumor heterogeneity by resolving cell-to-cell variability. However, it also poses additional challenges, including elevated error rates, allelic dropout, and non-uniform coverage. Variant calling in this context is the task of identifying mutations in the genomes of individual cells while accounting for these multiple types of error. One powerful approach for solving this task computationally is to rely on a phylogenetic context, since the genomes under analysis evolved from a common ancestor along the branches of a tree. The phylogenetic tree captures the temporal dependencies across the genomes and provides an important constraint that makes it possible to distinguish true mutations from errors that masquerade as mutations. However, simultaneously identifying mutations while accounting for the phylogenetic constraints is computationally challenging. In this thesis, I report on a new method that I developed, called scVILP, that jointly detects mutations in individual cells and reconstructs a “perfect phylogeny” of the cells (a phylogeny on which every site in the genomes mutates at most once). The method employs a novel Integer Linear Programming (ILP) formulation and utilizes publicly available ILP solvers. Furthermore, to address scalability, I developed a divide-and-conquer technique in which the ILP formulation is applied to and solved on subsets of the data, and the results are combined while resolving conflicts via constraints that are also formulated in terms of ILP. I demonstrate through analysis of simulated data sets that my method achieves accuracy similar to or better than that of existing methods, with significantly shorter runtime. My method provides a promising approach for analyzing large single-cell genomic data sets.
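The “perfect phylogeny” constraint mentioned in the abstract can be illustrated with the standard three-gamete test: a binary cell-by-site genotype matrix (with an unmutated root) fits a perfect phylogeny exactly when no pair of sites shows all three patterns (1,0), (0,1), and (1,1) across the cells. The following is a minimal sketch of that check only, not the scVILP ILP formulation; the function names and the toy matrices are hypothetical and chosen for illustration.

```python
from itertools import combinations


def has_conflict(col_a, col_b):
    """Return True if two binary site columns contain (1,0), (0,1) and (1,1)."""
    observed = set(zip(col_a, col_b))
    return {(1, 0), (0, 1), (1, 1)} <= observed


def admits_perfect_phylogeny(matrix):
    """Check the three-gamete condition on a 0/1 matrix.

    matrix: list of rows, one per cell; each row lists 0/1 calls per site.
    Assumes the root (common ancestor) carries no mutations.
    """
    columns = list(zip(*matrix))  # transpose: one tuple per site
    return not any(has_conflict(a, b) for a, b in combinations(columns, 2))


# Hypothetical toy data: 3 cells x 3 sites.
clean = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
# Same data with one changed call in the last cell, as an error might produce.
noisy = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
]

print(admits_perfect_phylogeny(clean))
print(admits_perfect_phylogeny(noisy))
```

In the noisy matrix the first two sites conflict, which is the kind of violation that a joint calling and tree-inference method has to resolve by revising genotype calls rather than merely detecting it.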

Degree
Master of Science
Type
Thesis
Keywords
Single-cell sequencing, Perfect Phylogeny
Citation

Edrisi, Mohammadamin. "Simultaneous SNV calling and Phylogenetic Inference for Single-cell Sequencing Data." (2020) Master’s Thesis, Rice University. https://hdl.handle.net/1911/109582.

Rights
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.