This thesis investigates the problem of fair statistical learning. We argue that critical notions of fairness can be represented by independence constraints on certain random variables, and we approximate independence by bounding moments. We propose a hierarchical Fair Optimization (FO) framework for generalized fair decision-making, prove that it has desirable statistical properties, and extend it to a number of settings, ranging from supervised learning to unsupervised learning and hypothesis testing.
Algorithmic decision-making has steadily gained in prominence as more data is produced and computing resources become more abundant. However, it has been observed that algorithmic decisions can often reflect, and perpetuate, biases present in the training data. To address this, we construct the FO framework as a general approach to statistical decision-making under fairness constraints. The framework revolves around bounding the moments between a ``score'' function underlying the decision-making process and predefined ``protected'' attributes. We prove that this framework is consistent, and thus asymptotically yields fair decision rules, and we derive non-asymptotic bounds on how quickly it approaches truly fair decision rules. We also provide experimental results that show the efficacy of the FO hierarchy on a variety of datasets, and use it to construct fair, automated one-time and sequential dosage mechanisms for morphine and heparin.
We then define novel, adversarial notions of fairness for the problem of dimensionality reduction, and define a semidefinite programming (SDP) relaxation of the FO hierarchy that controls these notions. We provide experimental analysis, including a case study on insurance rate-setting that allows for mechanisms that are fair with respect to legally motivated age restrictions. Similarly, we extend fairness to the problem of hypothesis testing, and draw the connection between fairness and robustness in this setting. This connection is realized in the form of a distributionally robust dynamic watermarking scheme to detect attacks on dynamical systems. Finally, we extend the intuitions of data-dependent regularization underlying the FO hierarchy to design a data-dependent regularizer that promotes robustness in classifiers in the low-data regime when data lies on a low-dimensional manifold.