- Tessema, Sofonias;
- Hathaway, Nicholas;
- Teyssier, Noam;
- Murphy, Maxwell;
- Chen, Anna;
- Aydemir, Ozkan;
- Duarte, Elias;
- Simone, Wilson;
- Colborn, James;
- Saute, Francisco;
- Crawford, Emily;
- Aide, Pedro;
- Bailey, Jeffrey;
- Greenhouse, Bryan
BACKGROUND: Targeted next-generation sequencing offers the potential for consistent, deep coverage of information-rich genomic regions to characterize polyclonal Plasmodium falciparum infections. However, methods to identify and sequence these genomic regions are currently limited.

METHODS: A bioinformatic pipeline and multiplex methods were developed to identify and simultaneously sequence 100 targets and applied to dried blood spot (DBS) controls and field isolates from Mozambique. For comparison, whole-genome sequencing data were generated for the same controls.

RESULTS: Using publicly available genomes, 4465 high-diversity genomic regions suited for targeted sequencing were identified, representing the P. falciparum heterozygome. For this study, 93 microhaplotypes with high diversity (median expected heterozygosity = 0.7) were selected along with 7 drug resistance loci. The sequencing method achieved very high coverage (median 99%), specificity (99.8%), and sensitivity (90% for haplotypes at 5% within-sample frequency in DBS with 100 parasites/µL). In silico analyses revealed that microhaplotypes provided much higher resolution to discriminate related from unrelated polyclonal infections than biallelic single-nucleotide polymorphism barcodes.

CONCLUSIONS: The bioinformatic and laboratory methods outlined here provide a flexible tool for efficient, low-cost, high-throughput interrogation of the P. falciparum genome, and can be tailored to simultaneously address multiple questions of interest in various epidemiological settings.
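
For reference, expected heterozygosity at a locus is conventionally computed as one minus the sum of squared allele frequencies. The sketch below illustrates that standard definition on made-up microhaplotype labels. The function name, the `unbiased` option, and the example alleles are assumptions for illustration and are not taken from the study's pipeline.

```python
from collections import Counter
from typing import Iterable


def expected_heterozygosity(alleles: Iterable[str], unbiased: bool = False) -> float:
    """Expected heterozygosity, He = 1 - sum(p_i^2), for one locus.

    `alleles` holds one observed allele (e.g. a microhaplotype label) per
    sampled parasite genome. With `unbiased=True` the common n / (n - 1)
    small-sample correction is applied. Both the name and the option are
    hypothetical, chosen only for this example.
    """
    counts = Counter(alleles)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no alleles observed")

    # Sum of squared allele frequencies = probability that two random
    # draws (with replacement) carry the same allele.
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    he = 1.0 - homozygosity

    if unbiased:
        if n < 2:
            raise ValueError("the unbiased estimate needs at least two alleles")
        he *= n / (n - 1)
    return he


# Made-up example: four genomes, two microhaplotypes at equal frequency.
example = ["hapA", "hapA", "hapB", "hapB"]
print(expected_heterozygosity(example))
```

A higher value means that two randomly drawn parasites are more likely to carry different alleles at the locus, which is what makes high-diversity microhaplotypes informative for polyclonal infections.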