eScholarship
Open Access Publications from the University of California

A metagenomic perspective on the microbial prokaryotic genome census.

Abstract

Following 30 years of sequencing, we assessed the phylogenetic diversity (PD) of >1.5 million microbial genomes in public databases, including metagenome-assembled genomes (MAGs) of uncultivated microbes. Compared with the vast diversity uncovered by metagenomic sequences, cultivated taxa account for a modest portion of the overall diversity, 9.73% in bacteria and 6.55% in archaea, while MAGs contribute 48.54% and 57.05%, respectively. Therefore, a substantial fraction of bacterial PD (41.73%) and archaeal PD (36.39%) still lacks any genomic representation. This unrepresented diversity manifests primarily at lower taxonomic ranks, exemplified by 134,966 species identified in 18,087 metagenomic samples. Our study exposes diversity hotspots in freshwater, marine subsurface, sediment, soil, and other environments, whereas human samples yielded minimal novelty within the context of existing datasets. These results offer a roadmap for future genome recovery efforts, delineating uncaptured taxa in underexplored environments and underscoring the necessity for renewed isolation and sequencing.
