Understanding treatment effect heterogeneity has become an increasingly important task in various fields. Treatment effect heterogeneity not only adds granularity to our understanding of everyday matters but also supports better-informed decision-making on many scientific frontiers.
In biomedical studies, learning treatment effect heterogeneity helps clinicians apply personalized treatments to patient subpopulations with different genetic profiles. Rather than prescribing one drug for all, clinicians can adopt refined prescription strategies that potentially improve patients’ overall welfare. In social science studies, evaluating the treatment effect heterogeneity of candidate policies guides policymakers in implementing future social programs.
In technology companies, understanding treatment effect heterogeneity helps decision-makers characterize market segmentation so that advertising budgets can be strategically allocated to the consumer subpopulations in which a new product is most likely to be profitable.

This dissertation provides a set of statistical methodologies for understanding treatment effect heterogeneity. It is organized into three chapters with three separate aims: (1) estimating treatment effect heterogeneity, (2) confirming treatment effect heterogeneity, and (3) designing adaptive experiments toward learning treatment effect heterogeneity.

Chapter 1 introduces a statistical methodology for estimating treatment effect heterogeneity efficiently. We take a model-free semiparametric perspective and evaluate the heterogeneous treatment effects of multiple subgroups simultaneously under the one-step targeted maximum likelihood estimation (TMLE) framework. When the number of subgroups is large, we further extend this line of research with a variant of the one-step TMLE that is robust to the presence of small estimated propensity scores in finite samples.
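
As a rough illustration of the kind of estimate this chapter targets, the Python sketch below computes a doubly robust one-step (AIPW-style) estimate of subgroup average treatment effects with truncated propensity scores. It is not the chapter's TMLE, and all data-generating choices, variable names, and the truncation level are invented for the example.

```python
# Illustrative sketch only: a doubly robust one-step (AIPW-style) estimate of
# subgroup average treatment effects, with propensity scores truncated away
# from zero to mimic the small-propensity concern discussed in Chapter 1.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
subgroup = (X[:, 0] > 0).astype(int)                   # two subgroups from a baseline covariate
A = rng.binomial(1, 1 / (1 + np.exp(-X[:, 1])))        # treatment assignment
Y = X[:, 2] + A * (1 + subgroup) + rng.normal(size=n)  # true effect is 1 or 2 by subgroup

# Nuisance fits (in practice these would be cross-fitted flexible learners).
ps = LogisticRegression().fit(X, A).predict_proba(X)[:, 1]
ps = np.clip(ps, 0.05, 0.95)                           # truncation guards against small propensities
mu1 = LinearRegression().fit(X[A == 1], Y[A == 1]).predict(X)
mu0 = LinearRegression().fit(X[A == 0], Y[A == 0]).predict(X)

# Efficient-influence-function (one-step) estimate within each subgroup.
eif = mu1 - mu0 + A * (Y - mu1) / ps - (1 - A) * (Y - mu0) / (1 - ps)
for g in (0, 1):
    mask = subgroup == g
    est = eif[mask].mean()
    se = eif[mask].std(ddof=1) / np.sqrt(mask.sum())
    print(f"subgroup {g}: ATE estimate {est:.3f} (SE {se:.3f})")
```

Clipping the estimated propensity scores at 0.05 is the simplest guard against the finite-sample instability that small propensities cause; the chapter addresses this issue in a more principled way.
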
Chapter 2 proposes a statistical methodology for confirming estimated heterogeneous treatment effects. Understanding the impact of the most effective treatments on outcome variables is crucial in many disciplines. Because of the widespread winner’s curse phenomenon, conventional statistical inference, which assumes that the top policies are chosen independently of the random sample, can lead to overly optimistic evaluations of the best policies. Moreover, given the increased availability of large datasets, this issue can be further complicated when researchers include many covariates to estimate the policy or treatment effects in an attempt to control for potential confounders. To address both issues simultaneously, we propose a resampling-based procedure that not only lifts the winner’s curse in evaluating the best policies observed in a random sample but is also robust to the presence of many covariates.
of many covariates. The proposed inference procedure yields accurate point estimates and
valid frequentist confidence intervals that achieve the exact nominal level as the sample size
goes to infinity for multiple best policy effect sizes. Chapter 3 provides an alternative perspective of studying the treatment effect heterogeneity. While much of the existing work in this research area has focused on either analyzing
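
To see the phenomenon the chapter corrects, consider the toy Python sketch below: with many truly null policies, the naively evaluated winner looks effective, and a simple bootstrap of the select-then-evaluate pipeline gauges the selection bias. This resampling caricature only illustrates the winner’s curse; it is not the chapter's procedure, and the numbers of policies, units, and bootstrap draws are arbitrary.

```python
# Illustrative sketch only: the winner's curse with truly null policies, and a
# naive bootstrap estimate of the selection bias (a caricature, not Chapter 2's
# procedure).
import numpy as np

rng = np.random.default_rng(1)
K, n = 20, 200                              # 20 candidate policies, n units
data = rng.normal(loc=np.zeros(K), scale=1.0, size=(n, K))  # all true effects are 0

means = data.mean(axis=0)
winner = means.argmax()
naive = means[winner]                       # optimistic: ignores that the winner was selected

# Bootstrap the "select the winner, then evaluate it" pipeline.
B, bias = 2000, 0.0
for _ in range(B):
    idx = rng.integers(0, n, size=n)
    bmeans = data[idx].mean(axis=0)
    bias += bmeans.max() - means[bmeans.argmax()]
bias /= B

print(f"naive winner estimate:     {naive:.3f}")   # noticeably above the true value 0
print(f"bias-corrected estimate:   {naive - bias:.3f}")
```
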
Chapter 3 offers an alternative perspective on studying treatment effect heterogeneity. While much of the existing work in this area has focused on either analyzing observational data under untestable causal assumptions or conducting post hoc analyses of existing randomized controlled trial data, little work has gone into designing randomized experiments specifically for uncovering treatment effect heterogeneity. In this chapter, we
develop a unified adaptive experimental design framework for better learning treatment effect heterogeneity by efficiently identifying subgroups with enhanced treatment effects from a frequentist viewpoint. The adaptive nature of our framework allows practitioners to sequentially allocate experimental effort in response to the evidence accrued during the experiment.
The resulting design framework can not only complement A/B tests in e-commerce but also
unify enrichment designs and response-adaptive randomization designs in clinical settings.
Our theoretical investigations illustrate the trade-offs between complete randomization and
our adaptive experimental algorithms.
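
As a toy illustration of the adaptive allocation idea (not the chapter's framework), the Python sketch below runs a two-stage design that randomizes evenly across subgroups in stage 1 and then enriches subgroups whose interim estimated effects look promising. The subgroup effects, stage budgets, and the 0.3 threshold are all invented for the example.

```python
# Illustrative sketch only: a two-stage enrichment-style adaptive design.
# Stage 1 splits the budget evenly; stage 2 concentrates on subgroups whose
# interim estimates exceed a threshold. All numbers here are arbitrary.
import numpy as np

rng = np.random.default_rng(2)
true_effects = {"g1": 0.0, "g2": 0.5, "g3": 1.0}   # hypothetical subgroup effects

def run_trial(group, n):
    """Randomize n units 1:1 within a subgroup; return the difference in means."""
    a = rng.binomial(1, 0.5, size=n)
    y = true_effects[group] * a + rng.normal(size=n)
    return y[a == 1].mean() - y[a == 0].mean()

# Stage 1: complete randomization with equal effort per subgroup.
stage1 = {g: run_trial(g, 100) for g in true_effects}

# Stage 2: enrich subgroups with promising interim effects.
promising = [g for g, est in stage1.items() if est > 0.3]
stage2_n = 600 // max(len(promising), 1)
stage2 = {g: run_trial(g, stage2_n) for g in promising}

print("stage-1 interim estimates:", {g: round(e, 2) for g, e in stage1.items()})
print("stage-2 enriched estimates:", {g: round(e, 2) for g, e in stage2.items()})
```
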