Vulcan: Improved long-read mapping and structural variant calling via dual-mode alignment

Fu, Yilei; Mahmoud, Medhat; Muraliraman, Viginesh Vaibhav; Sedlazeck, Fritz J; Treangen, Todd J

dc.citation.articleNumber	giab063	en_US
dc.citation.issueNumber	9	en_US
dc.citation.journalTitle	GigaScience	en_US
dc.citation.volumeNumber	10	en_US
dc.contributor.author	Fu, Yilei	en_US
dc.contributor.author	Mahmoud, Medhat	en_US
dc.contributor.author	Muraliraman, Viginesh Vaibhav	en_US
dc.contributor.author	Sedlazeck, Fritz J	en_US
dc.contributor.author	Treangen, Todd J	en_US
dc.date.accessioned	2021-10-28T18:47:24Z	en_US
dc.date.available	2021-10-28T18:47:24Z	en_US
dc.date.issued	2021	en_US
dc.description.abstract	Long-read sequencing has enabled unprecedented surveys of structural variation across the entire human genome. To maximize the potential of long-read sequencing in this context, novel mapping methods have emerged that have primarily focused on either speed or accuracy. Various heuristics and scoring schemas have been implemented in widely used read mappers (minimap2 and NGMLR) to optimize for speed or accuracy, which have variable performance across different genomic regions and for specific structural variants. Our hypothesis is that constraining read mapping to the use of a single gap penalty across distinct mutational hot spots reduces read alignment accuracy and impedes structural variant detection. We tested our hypothesis by implementing a read-mapping pipeline called Vulcan that uses two distinct gap penalty modes, which we refer to as dual-mode alignment. The high-level idea is that Vulcan leverages the computed normalized edit distance of the mapped reads via minimap2 to identify poorly aligned reads and realigns them using the more accurate yet computationally more expensive long-read mapper (NGMLR). In support of our hypothesis, we show that Vulcan improves the alignments for Oxford Nanopore Technology long reads for both simulated and real datasets. These improvements, in turn, lead to improved accuracy for structural variant calling performance on human genome datasets compared to either of the read-mapping methods alone. Vulcan is the first long-read mapping framework that combines two distinct gap penalty modes for improved structural variant recall and precision. Vulcan is open-source and available under the MIT License at https://gitlab.com/treangenlab/vulcan.	en_US
dc.identifier.citation	Fu, Yilei, Mahmoud, Medhat, Muraliraman, Viginesh Vaibhav, et al. "Vulcan: Improved long-read mapping and structural variant calling via dual-mode alignment." GigaScience, 10, no. 9 (2021) Oxford University Press: https://doi.org/10.1093/gigascience/giab063.	en_US
dc.identifier.digital	giab063	en_US
dc.identifier.doi	https://doi.org/10.1093/gigascience/giab063	en_US
dc.identifier.uri	https://hdl.handle.net/1911/111619	en_US
dc.language.iso	eng	en_US
dc.publisher	Oxford University Press	en_US
dc.rights	This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	en_US
dc.title	Vulcan: Improved long-read mapping and structural variant calling via dual-mode alignment	en_US
dc.type	Journal article	en_US
dc.type.dcmi	Text	en_US
dc.type.publication	publisher version	en_US
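
The abstract describes selecting poorly aligned reads by normalized edit distance and realigning only those with the slower mapper. The sketch below illustrates that selection step under stated assumptions: the normalized edit distance is taken as edit distance divided by aligned length, and the fixed `cutoff` value, the function names, and the input tuples are all hypothetical. It is not the Vulcan implementation, which may define the distance and choose its threshold differently.

```python
from typing import List, Tuple

# Hypothetical input record: (read name, edit distance, aligned length).
Read = Tuple[str, int, int]


def normalized_edit_distance(edit_distance: int, aligned_length: int) -> float:
    """Edit distance per aligned base (assumed definition for this sketch)."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return edit_distance / aligned_length


def split_reads(reads: List[Read], cutoff: float) -> Tuple[List[str], List[str]]:
    """Partition read names into (keep, realign) lists.

    Reads whose normalized edit distance exceeds `cutoff` are flagged for
    realignment with the second, more expensive mapper; the rest keep
    their first-pass alignment.
    """
    keep: List[str] = []
    realign: List[str] = []
    for name, edit_distance, aligned_length in reads:
        if normalized_edit_distance(edit_distance, aligned_length) > cutoff:
            realign.append(name)
        else:
            keep.append(name)
    return keep, realign


# Toy first-pass alignments (made-up values, for illustration only).
reads = [("read_a", 50, 1000), ("read_b", 300, 1000), ("read_c", 120, 2000)]
keep, realign = split_reads(reads, cutoff=0.2)
```

With these toy values, only `read_b` (300 edits over 1000 aligned bases) exceeds the assumed cutoff and would be passed to the second mapper.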

Files

Original bundle

Name: giab063.pdf
Size: 1.79 MB
Format: Adobe Portable Document Format

Collections

Faculty Publications
Computer Science Publications