eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

Topics in Evidence Synthesis

Abstract

This dissertation considers three topics related to extracting and merging evidence from heterogeneous sources. The problem is approached from several angles, ranging from the design of experiments to machine learning.

Within this dissertation, we add to the existing literature in each area by developing novel methodology and software.

Adaptive trial designs can considerably improve upon traditional designs by modifying aspects of an ongoing trial, for example by stopping early, adding or dropping doses, or changing the sample size.

We propose a two-stage Bayesian adaptive design for a Phase IIb study aimed at selecting the lowest effective dose for Phase III. In this setting, efficacy has been established for a high dose in a Phase IIa proof-of-concept study, and the existence of a lower but still effective dose is investigated before the scheduled Phase III starts.

In the first stage, patients are randomized to placebo, the maximal tolerated dose, and one or more additional doses within the dose range. Based on an interim analysis, the study is either stopped for futility or success, or enters the second stage, where newly recruited patients are allocated to placebo, a fairly high dose, and one additional dose chosen based on the interim data. At the interim analysis, criteria based on the predictive probability of success are used to decide whether to stop or continue the trial and, in the latter case, which dose to select for the second stage. Finally, a dose is selected as the lowest effective dose for Phase III at the end of either the first or the second stage. The operating characteristics of the procedure are evaluated via simulations, and results are presented for several scenarios comparing the performance of the proposed procedure to that of the non-adaptive design.
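The interim decision described above can be illustrated with a minimal sketch. It is not the design developed in the dissertation: it assumes a normally distributed endpoint with known standard deviation, a flat prior on the dose-versus-placebo difference, equal allocation to the two arms, and a pooled one-sided z-test at the end of stage two. The function names, the thresholds `futility` and `efficacy`, and the rule for carrying a dose forward are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def predictive_probability_of_success(d1, n1, n2, sigma, alpha=0.025):
    """Predictive probability that the pooled final z-test is significant.

    d1    : observed stage-one mean difference (dose minus placebo)
    n1    : stage-one patients per arm
    n2    : planned stage-two patients per arm
    sigma : known outcome standard deviation (an assumption of this sketch)
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    n_total = n1 + n2
    # Standard error of the pooled difference at the final analysis.
    se_final = sigma * sqrt(2 / n_total)
    # Stage-two difference needed for the pooled estimate to cross the bound.
    d2_needed = (n_total * z_crit * se_final - n1 * d1) / n2
    # Predictive sd of the stage-two difference under a flat prior:
    # posterior uncertainty about the true effect plus sampling noise.
    sd_pred = sigma * sqrt(2 / n1 + 2 / n2)
    return 1 - std_normal.cdf((d2_needed - d1) / sd_pred)


def interim_decision(ppos_by_dose, futility=0.10, efficacy=0.90):
    """Hypothetical decision rule; ppos_by_dose is [(dose, ppos), ...] by increasing dose."""
    if max(p for _, p in ppos_by_dose) < futility:
        return ("stop_futility", None)
    lowest_dose, lowest_ppos = ppos_by_dose[0]
    if lowest_ppos >= efficacy:
        return ("stop_success", lowest_dose)
    # Otherwise continue with the lowest dose that is not clearly futile.
    for dose, p in ppos_by_dose:
        if p >= futility:
            return ("continue", dose)
```

Operating characteristics such as the probability of selecting the correct dose would then be estimated by simulating many trials under each scenario and applying a rule of this kind at the interim analysis.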

The development of novel therapies in multiple sclerosis (MS) is one area where a range of surrogate outcomes are used in various stages of clinical research. While the aim of treatments in MS is to prevent disability, a clinical trial evaluating a drug's effect on disability progression would require a large sample of patients with many years of follow-up. The early stage of MS is characterized by relapses. To reduce study size and duration, clinical relapses are accepted as primary endpoints in Phase III trials. For Phase II studies, the primary outcomes are typically lesion counts based on Magnetic Resonance Imaging (MRI), as these are considerably more sensitive than clinical measures for detecting MS activity.

Recently, Sormani and colleagues \cite{sormani2010surrogate} provided a systematic review and used weighted regression analyses to examine the role of either MRI lesions or relapses as trial-level surrogate outcomes for disability. We build on this work by developing a Bayesian three-level model that accommodates the two surrogates and the disability endpoint and properly takes into account that treatment effects are estimated with error. Specifically, a combination of treatment effects based on MRI lesion count outcomes and clinical relapses, both expressed on the log risk ratio scale, was used to develop a study-level surrogate outcome model for the corresponding treatment effects on disability progression. While the primary aim in developing this model was to support decision making in drug development, the proposed model may also be considered for future validation.
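For orientation, the trial-level weighted regression that this work builds on can be sketched as follows. This is a generic weighted least squares fit, not the Bayesian three-level model of the dissertation, and it treats the surrogate effects as if they were measured without error. The function names and the choice of weights (for example trial size or inverse variance) are assumptions made for illustration.

```python
from math import log


def log_risk_ratio(events_trt, n_trt, events_ctl, n_ctl):
    """Treatment effect on the log risk ratio scale for one trial."""
    return log(events_trt / n_trt) - log(events_ctl / n_ctl)


def weighted_linear_fit(x, y, weights):
    """Weighted least squares for y = intercept + slope * x.

    x       : per-trial treatment effects on the surrogate (log risk ratios)
    y       : per-trial treatment effects on disability (log risk ratios)
    weights : per-trial weights, e.g. trial size (hypothetical choice)
    """
    w_sum = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / w_sum
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / w_sum
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope
```

A fit of this kind returns only a point estimate of the surrogate relationship. A hierarchical model can additionally carry the within-trial estimation error of each treatment effect through to the prediction for a new trial.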

In genomics and epidemiology we deal with a large number of features for each observation. Many well-known approaches to drawing inferences in such settings use the topology of the feature space, induced by an appropriate metric, to group observations and summarize their main characteristics, both to reduce noise and to predict an outcome of interest. In the present work we generalize this approach in the context of loss-based estimation. We propose an alternative method for constructing a nonparametric multidimensional regression function. The approach is based on the simple idea of clustering data points in the feature space and then fitting a constant to the outcome within each cluster. HOPACH-PAM is used to build the partition. The result is a small number of distinct regions that are easy to interpret. This is illustrated by simulations, in which the method outperforms CART. Pre-screening and feature selection methods are also developed to improve performance and reduce noise. Software is available in the R package HOPSLAM (HOpach-Pam Supervised Learning AlgorithM) to make this methodology easily accessible.
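The core idea of clustering in the feature space and fitting a constant per cluster can be sketched as below. This is not the HOPSLAM implementation and does not use HOPACH-PAM: it substitutes a plain alternating k-medoids with Euclidean distance and a fixed number of clusters `k`, and it assumes squared-error loss, so that the fitted constant is the cluster mean. All function names are hypothetical.

```python
from math import dist
from statistics import mean


def nearest_medoid(point, medoids):
    """Index of the medoid closest to point (Euclidean distance)."""
    return min(range(len(medoids)), key=lambda j: dist(point, medoids[j]))


def k_medoids(points, k, n_iter=20):
    """Simple alternating k-medoids; a stand-in for HOPACH-PAM."""
    medoids = list(points[:k])  # deterministic start: the first k points
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_medoid(p, medoids)].append(p)
        # New medoid: the member minimising total distance to its cluster.
        new_medoids = [
            min(c, key=lambda cand: sum(dist(cand, q) for q in c)) if c else medoids[j]
            for j, c in enumerate(clusters)
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


def fit_piecewise_constant(points, outcomes, k):
    """Cluster the features, then fit one constant (the mean) per cluster."""
    medoids = k_medoids(points, k)
    grouped = [[] for _ in medoids]
    for p, y in zip(points, outcomes):
        grouped[nearest_medoid(p, medoids)].append(y)
    overall = mean(outcomes)  # fallback for an empty cluster
    constants = [mean(ys) if ys else overall for ys in grouped]
    return medoids, constants


def predict(point, medoids, constants):
    """Predict with the constant of the nearest medoid's region."""
    return constants[nearest_medoid(point, medoids)]
```

Each medoid defines one region and each region carries a single fitted value, which is what makes the resulting regression function easy to read. Under a different loss function the per-cluster constant would change accordingly, for example to the median under absolute-error loss.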
