Tracing the evolutionary history of pathogen outbreaks allows researchers to develop appropriate public health interventions. For example, phylogenetic inferences
have been key to informing the response to the ongoing COVID-19 pandemic. I worked
with researchers at the CDC to develop and test tools to rapidly infer phylogenies for large
genomic data sets. I applied these new tools to understand the evolution of Neisseria
gonorrhoeae, the bacterium that causes gonorrhea and a pathogen of major public health
importance that is increasing in both prevalence and rate of antimicrobial resistance. I found that our tools
reduced program runtime and data set fragmentation while producing reliable phylogenetic
estimates. I also investigated the underlying approach used by our methods to assemble
genomic sequences. I found that reference choice is an important consideration when
assembling sequences, as greater evolutionary distance to the reference genome leads to an
increase in errors. However, although errors increase with evolutionary distance
to the reference genome, the overall phylogenetic topology is largely unaffected. Finally, having
shown that my original tools are reliable, I extended the methods and applied them to
analyze the evolutionary relationships of over 1,000 N. gonorrhoeae isolates in order to
map the gain and loss of antimicrobial resistance alleles. Together, these results
demonstrate that the tools I have developed can be used to rapidly and accurately analyze
genome-scale data for thousands of lineages, and link those evolutionary inferences with
important metadata to better inform public health interventions.