- Klau, Leesa J
- Podell, Sheila
- Creamer, Kaitlin E
- Demko, Alyssa M
- Singh, Hans W
- Allen, Eric E
- Moore, Bradley S
- Ziemert, Nadine
- Letzel, Anne Catrin
- Jensen, Paul R
The Natural Product Domain Seeker (NaPDoS) webtool detects and classifies ketosynthase (KS) and condensation domains from genomic, metagenomic, and amplicon sequence data. Unlike other tools, NaPDoS uses a phylogeny-based classification scheme to make broader predictions about the polyketide synthase (PKS) and nonribosomal peptide synthetase (NRPS) genes in which these domains are found. It is particularly useful for analyzing incomplete biosynthetic genes or gene clusters, which are often observed in poorly assembled genomes and metagenomes, or when loci are not clustered, as in eukaryotic genomes. To support the growing interest in sequence-based analyses of natural product biosynthetic diversity, we introduce version 2 of the webtool, NaPDoS2, available at http://napdos.ucsd.edu/napdos2. This update adds 1417 KS sequences, a major expansion of the taxonomic and functional diversity represented in the webtool database. The phylogeny-based KS classification scheme now recognizes 41 class and subclass assignments, including new type II PKS subclasses. Workflow modifications shorten run times, allowing larger datasets to be analyzed. In addition, default parameters were established using statistical validation tests to maximize KS detection and classification accuracy while minimizing false positives. We further demonstrate applications of NaPDoS2 for assessing PKS biosynthetic potential using genomic, metagenomic, and PCR amplicon datasets. These examples illustrate how NaPDoS2 can be used to predict biosynthetic potential and to detect genes involved in the biosynthesis of specific structure classes or new biosynthetic mechanisms.