- Black, Gabrielle;
- Lowe, Charles;
- Anumol, Tarun;
- Bade, Jessica;
- Favela, Kristin;
- Feng, Yong-Lai;
- Knolhoff, Ann;
- Mceachran, Andrew;
- Nuñez, Jamie;
- Fisher, Christine;
- Peter, Kathy;
- Quinete, Natalia Soares;
- Sobus, Jon;
- Sussman, Eric;
- Watson, William;
- Wickramasekara, Samanthi;
- Williams, Antony;
- Young, Tom
Non-targeted analysis (NTA) using high-resolution mass spectrometry allows scientists to detect and identify a broad range of compounds in diverse matrices for monitoring exposure and toxicological evaluation without a priori chemical knowledge. NTA methods present an opportunity to describe the constituents of a sample across a multidimensional swath of chemical properties, referred to as "chemical space." Understanding and communicating which region of chemical space is extractable and detectable by an NTA workflow, however, remains challenging and non-standardized. For example, many sample processing and data analysis steps influence the types of chemicals that can be detected and identified. Accordingly, it is challenging to assess whether analyte non-detection in an NTA study indicates true absence in a sample (above a detection limit) or is a false negative driven by workflow limitations. Here, we describe the need for accessible approaches that enable chemical space mapping in NTA studies, propose a tool to address this need, and highlight the different ways in which it could be implemented in NTA workflows. We identify a suite of existing predictive and analytical tools that can be used in combination to generate scores that describe the likelihood a compound will be detected and identified by a given NTA workflow based on the predicted chemical space of that workflow. Higher scores correspond to a higher likelihood of compound detection and identification in a given workflow (based on sample extraction, data acquisition, and data analysis parameters). Lower scores indicate a lower probability of detection, even if the compound is truly present in the samples of interest. Understanding the constraints of NTA workflows can be useful for stakeholders when results from NTA studies are used in real-world applications and for NTA researchers working to improve their workflow performance. The hypothetical ChemSpaceTool suggested herein could be used in both a prospective and retrospective sense. Prospectively, the tool can be used to further curate screening libraries and set identification thresholds. Retrospectively, false detections can be filtered by the plausibility of the compound identification by the selected NTA method, increasing the confidence of unknown identifications. Lastly, this work highlights the chemometric needs to make such a tool robust and usable across a wide range of NTA disciplines and invites others who are working on various models to participate in the development of the ChemSpaceTool. Ultimately, the development of a chemical space mapping tool strives to enable further standardization of NTA by improving method transparency and communication around false detection rates, thus allowing for more direct method comparisons between studies and improved reproducibility. This, in turn, is expected to promote further widespread applications of NTA beyond research-oriented settings.