High-coverage nanopore sequencing of samples from the 1000 Genomes Project to build a comprehensive catalog of human genetic variation

dc.citation.firstpage: 2061 [en_US]
dc.citation.issueNumber: 11 [en_US]
dc.citation.journalTitle: Genome Research [en_US]
dc.citation.lastpage: 2073 [en_US]
dc.citation.volumeNumber: 34 [en_US]
dc.contributor.author: Gustafson, Jonas A. [en_US]
dc.contributor.author: Gibson, Sophia B. [en_US]
dc.contributor.author: Damaraju, Nikhita [en_US]
dc.contributor.author: Zalusky, Miranda P. G. [en_US]
dc.contributor.author: Hoekzema, Kendra [en_US]
dc.contributor.author: Twesigomwe, David [en_US]
dc.contributor.author: Yang, Lei [en_US]
dc.contributor.author: Snead, Anthony A. [en_US]
dc.contributor.author: Richmond, Phillip A. [en_US]
dc.contributor.author: De Coster, Wouter [en_US]
dc.contributor.author: Olson, Nathan D. [en_US]
dc.contributor.author: Guarracino, Andrea [en_US]
dc.contributor.author: Li, Qiuhui [en_US]
dc.contributor.author: Miller, Angela L. [en_US]
dc.contributor.author: Goffena, Joy [en_US]
dc.contributor.author: Anderson, Zachary B. [en_US]
dc.contributor.author: Storz, Sophie H. R. [en_US]
dc.contributor.author: Ward, Sydney A. [en_US]
dc.contributor.author: Sinha, Maisha [en_US]
dc.contributor.author: Gonzaga-Jauregui, Claudia [en_US]
dc.contributor.author: Clarke, Wayne E. [en_US]
dc.contributor.author: Basile, Anna O. [en_US]
dc.contributor.author: Corvelo, André [en_US]
dc.contributor.author: Reeves, Catherine [en_US]
dc.contributor.author: Helland, Adrienne [en_US]
dc.contributor.author: Musunuri, Rajeeva Lochan [en_US]
dc.contributor.author: Revsine, Mahler [en_US]
dc.contributor.author: Patterson, Karynne E. [en_US]
dc.contributor.author: Paschal, Cate R. [en_US]
dc.contributor.author: Zakarian, Christina [en_US]
dc.contributor.author: Goodwin, Sara [en_US]
dc.contributor.author: Jensen, Tanner D. [en_US]
dc.contributor.author: Robb, Esther [en_US]
dc.contributor.author: The 1000 Genomes ONT Sequencing Consortium [en_US]
dc.contributor.author: University of Washington Center for Rare Disease Research (UW-CRDR) [en_US]
dc.contributor.author: Genomics Research to Elucidate the Genetics of Rare Diseases (GREGoR) Consortium [en_US]
dc.contributor.author: McCombie, William Richard [en_US]
dc.contributor.author: Sedlazeck, Fritz J. [en_US]
dc.contributor.author: Zook, Justin M. [en_US]
dc.contributor.author: Montgomery, Stephen B. [en_US]
dc.contributor.author: Garrison, Erik [en_US]
dc.contributor.author: Kolmogorov, Mikhail [en_US]
dc.contributor.author: Schatz, Michael C. [en_US]
dc.contributor.author: McLaughlin, Richard N. [en_US]
dc.contributor.author: Dashnow, Harriet [en_US]
dc.contributor.author: Zody, Michael C. [en_US]
dc.contributor.author: Loose, Matt [en_US]
dc.contributor.author: Jain, Miten [en_US]
dc.contributor.author: Eichler, Evan E. [en_US]
dc.contributor.author: Miller, Danny E. [en_US]
dc.date.accessioned: 2025-01-09T20:16:57Z [en_US]
dc.date.available: 2025-01-09T20:16:57Z [en_US]
dc.date.issued: 2024 [en_US]
dc.description.abstract: Fewer than half of individuals with a suspected Mendelian or monogenic condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control data sets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project (1KGP) Oxford Nanopore Technologies Sequencing Consortium aims to generate LRS data from at least 800 of the 1KGP samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37× and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs. [en_US]
dc.identifier.citation: Gustafson, J. A., Gibson, S. B., Damaraju, N., Zalusky, M. P. G., Hoekzema, K., Twesigomwe, D., Yang, L., Snead, A. A., Richmond, P. A., De Coster, W., Olson, N. D., Guarracino, A., Li, Q., Miller, A. L., Goffena, J., Anderson, Z. B., Storz, S. H. R., Ward, S. A., Sinha, M., … Miller, D. E. (2024). High-coverage nanopore sequencing of samples from the 1000 Genomes Project to build a comprehensive catalog of human genetic variation. Genome Research, 34(11), 2061–2073. https://doi.org/10.1101/gr.279273.124 [en_US]
dc.identifier.digital: GenomeRes-2024-Gustafson-2061-73 [en_US]
dc.identifier.doi: https://doi.org/10.1101/gr.279273.124 [en_US]
dc.identifier.uri: https://hdl.handle.net/1911/118104 [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Cold Spring Harbor Laboratory Press [en_US]
dc.rights: Except where otherwise noted, this work is licensed under a Creative Commons Attribution-NonCommercial (CC BY-NC) license. Permission to reuse, publish, or reproduce the work beyond the terms of the license or beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by-nc/4.0/ [en_US]
dc.title: High-coverage nanopore sequencing of samples from the 1000 Genomes Project to build a comprehensive catalog of human genetic variation [en_US]
dc.type: Journal article [en_US]
dc.type.dcmi: Text [en_US]
dc.type.publication: publisher version [en_US]
Files
Original bundle
Name: GenomeRes-2024-Gustafson-2061-73.pdf
Size: 1.38 MB
Format: Adobe Portable Document Format