A catalog of tens of thousands of viruses from human metagenomes reveals hidden associations with chronic diseases.

Published on Jun 3, 2021 in Proceedings of the National Academy of Sciences of the United States of America · 9.412 · DOI: 10.1073/PNAS.2023202118
Michael J. Tisza (NIH: National Institutes of Health), Estimated H-index: 7
Christopher B. Buck (NIH: National Institutes of Health), Estimated H-index: 52
Despite remarkable strides in microbiome research, the viral component of the microbiome has generally presented a more challenging target than the bacteriome. This gap persists, even though many thousands of shotgun sequencing runs from human metagenomic samples exist in public databases, and all of them encompass large amounts of viral sequence data. The lack of a comprehensive database for human-associated viruses has historically stymied efforts to interrogate the impact of the virome on human health. This study probes thousands of datasets to uncover sequences from over 45,000 unique virus taxa, with historically high per-genome completeness. Large publicly available case-control studies are reanalyzed, and over 2,200 strong virus-disease associations are found.
#1 Stephen Nayfach (LBNL: Lawrence Berkeley National Laboratory), H-Index: 18
#2 Antonio P. Camargo (State University of Campinas), H-Index: 6
Last. Nikos C. Kyrpides (LBNL: Lawrence Berkeley National Laboratory), H-Index: 100
Millions of new viral sequences have been identified from metagenomes, but the quality and completeness of these sequences vary considerably. Here we present CheckV, an automated pipeline for identifying closed viral genomes, estimating the completeness of genome fragments and removing flanking host regions from integrated proviruses. CheckV estimates completeness by comparing sequences with a large database of complete viral genomes, including 76,262 identified from a systematic search of publi...
15 Citations
#1 Moïra B. Dion (Laval University), H-Index: 5
#2 Pier-Luc Plante (Laval University), H-Index: 10
Last. Sylvain Moineau (Laval University), H-Index: 74
Thousands of new phages have recently been discovered thanks to viral metagenomics. These phages are extremely diverse and their genome sequences often do not resemble any known phages. To appreciate their ecological impact, it is important to determine their bacterial hosts. CRISPR spacers can be used to predict hosts of unknown phages, as spacers represent biological records of past phage-bacteria interactions. However, no guidelines have been established to standardize host prediction based o...
#1 Luis Fernando Camarillo-Guerrero (Wellcome Trust Sanger Institute), H-Index: 2
#2 Alexandre Almeida (EMBL-EBI: European Bioinformatics Institute), H-Index: 15
Last. Trevor D. Lawley (Wellcome Trust Sanger Institute), H-Index: 55
Bacteriophages drive evolutionary change in bacterial communities by creating gene flow networks that fuel ecological adaptations. However, the extent of viral diversity and its prevalence in the human gut remain largely unknown. Here, we introduce the Gut Phage Database, a collection of ∼142,000 non-redundant viral genomes (>10 kb) obtained by mining a dataset of 28,060 globally distributed human gut metagenomes and 2,898 reference genomes of cultured gut bacteria. Host assignment reveal...
8 Citations
#1 Eugene V. Koonin (NIH: National Institutes of Health), H-Index: 218
#2 Kira S. Makarova (NIH: National Institutes of Health), H-Index: 100
Last. Yuri I. Wolf (NIH: National Institutes of Health), H-Index: 107
Prokaryote genomics started in earnest in 1995, with the complete sequences of two small bacterial genomes, those of Haemophilus influenzae and Mycoplasma genitalium. During the next quarter century, the prokaryote genome database has been growing exponentially, with no saturation in sight. For most of these 25 years, genome sequencing remained limited to cultivable microbes. Together with next-generation sequencing methods, advances in metagenomics and single-cell genomics have lifted this limi...
#1 Michael J. Tisza, H-Index: 7
#2 Anna K. Belford, H-Index: 4
Last. Christopher B. Buck, H-Index: 52
Viruses, despite their great abundance and significance in biological systems, remain largely mysterious. Indeed, the vast majority of the perhaps hundreds of millions of viral species on the planet remain undiscovered. Additionally, many viruses deposited in central databases like GenBank and RefSeq are littered with genes annotated as 'hypothetical protein' or the equivalent. Cenote-Taker 2, a virus discovery and annotation tool available on command line and with a graphical user interface wit...
#1 Philip M. Nussenzweig (Kettering University), H-Index: 4
Last. Luciano A. Marraffini (Rockefeller University), H-Index: 45
Prokaryotes have developed numerous defense strategies to combat the constant threat posed by the diverse genetic parasites that endanger them. Clustered regularly interspaced short palindromic rep...
5 Citations
#1 Christopher M. Bellas (University of Innsbruck), H-Index: 10
#2 Declan C. Schroeder (University of Reading), H-Index: 34
Last. Alexandre M. Anesio (AU: Aarhus University), H-Index: 53
Bacteriophage genomes rapidly evolve via mutation and horizontal gene transfer to counter evolving bacterial host defenses; such arms race dynamics should lead to divergence between phages from similar, geographically isolated ecosystems. However, near-identical phage genomes can reoccur over large geographical distances and several years apart, conversely suggesting many are stably maintained. Here, we show that phages with near-identical core genomes in distant, discrete aquatic ecosystems mai...
3 Citations
#1 Ann C. Gregory (OSU: Ohio State University), H-Index: 14
#2 Olivier Zablocki (OSU: Ohio State University), H-Index: 9
Last. Matthew B. Sullivan (OSU: Ohio State University), H-Index: 75
The gut microbiome profoundly affects human health and disease, and the viruses that infect it are likely as important but are often missed because of reference database limitations. Here, we (1) build a human Gut Virome Database (GVD) from 2,697 viral particle or microbial metagenomes from 1,986 individuals representing 16 countries, (2) assess its effectiveness, and (3) report a meta-analysis that reveals age-dependent patterns across healthy Westerners. The GVD contains 33,242 unique viral populati...
37 Citations
#1 Aurélie Fluckiger (French Institute of Health and Medical Research), H-Index: 8
#2 Romain Daillère (French Institute of Health and Medical Research), H-Index: 11
Last. Laurence Zitvogel, H-Index: 148
Intestinal microbiota have been proposed to induce commensal-specific memory T cells that cross-react with tumor-associated antigens. We identified major histocompatibility complex (MHC) class I-binding epitopes in the tail length tape measure protein (TMP) of a prophage found in the genome of the bacteriophage Enterococcus hirae. Mice bearing E. hirae harboring this prophage mounted a TMP-specific H-2Kb-restricted CD8+ T lymphocyte response upon immunotherapy with cyclophosphamide or anti-PD-1 a...
28 Citations
#1 Patrick Pausch (University of California, Berkeley), H-Index: 11
Last. Jennifer A. Doudna, H-Index: 132
CRISPR-Cas systems are found widely in prokaryotes, where they provide adaptive immunity against virus infection and plasmid transformation. We describe a minimal functional CRISPR-Cas system, comprising a single ~70-kilodalton protein, CasΦ, and a CRISPR array, encoded exclusively in the genomes of huge bacteriophages. CasΦ uses a single active site for both CRISPR RNA (crRNA) processing and crRNA-guided DNA cutting to target foreign nucleic acids. This hypercompact system is active in vitro an...
56 Citations