The last decade has witnessed rapid development and deployment of machine-learning systems across science. Such systems can supply predictions about scientific phenomena far more quickly and cheaply than gold-standard experiments, and are being used in efforts both to discover scientific knowledge and to design new biomolecules. However, an important question remains unanswered: since machine-learning systems make errors, how can we use them in a trustworthy way for scientific discovery and design? This dissertation takes steps toward ensuring that the biomolecules we design and the scientific conclusions we draw using machine learning can be trusted.
We begin in the setting of machine learning-based design. The goal in this setting is to propose novel objects such as proteins, small molecules, or materials with desired properties, in a way that is guided by machine-learning models of such properties. Toward addressing model trustworthiness for design, we propose (i) a method for learning models that accounts for the distribution shifts inherent to design, and (ii) a method for constructing statistically valid confidence sets for the properties of objects designed using machine learning.
We then examine the trustworthy use of machine learning for drawing scientific conclusions. In particular, we consider the increasingly relevant setting in which predictions made by machine-learning systems are treated as “data” for estimating quantities of scientific interest. We propose prediction-powered inference, a novel statistical framework for constructing valid confidence sets in this setting, which enables researchers to incorporate evidence from machine-learning systems into their scientific inquiry in a standardized and principled way.