Topics in Current Status Data
- McKeown, Karen Michelle
- Advisor(s): Jewell, Nicholas P
Abstract
This dissertation considers topics in current status data, a type of survival data where the only available information on the survival time is whether or not the event has occurred before the examination time. We introduce the concept of current status data and give motivating examples that highlight some of the many areas in which data of this type arise naturally in practice. We discuss some of the well-known and widely used methods for analyzing current status data, along with more recent developments in the area, and provide references to these previously examined methods. Within this dissertation, we add to the existing literature by developing ideas not previously addressed from a current status data perspective.
We describe a simple method for nonparametric estimation of a distribution function based on current status data where observations of current status information are subject to (known) misclassification. Nonparametric maximum likelihood estimates are obtained through a straightforward set of adjustments to the familiar pool-adjacent-violators algorithm, which is generally used when misclassification is assumed absent. The methods are extended to allow for misclassification rates that vary over time, particularly when misclassification is most likely to occur close to the time of the true failure event. Using the ideas of binary generalized linear models with outcomes subject to misclassification, we consider regression models for the underlying survival time. The ideas are motivated by and applied to an example on human papillomavirus (HPV) infection status amongst women examined in San Francisco. Additional applications on breastfeeding behaviors and menopausal status are also presented. As an extension, we consider group testing with current status data in the presence of misclassification. Group testing combines samples, such as blood or urine, from a number of individuals and tests the group sample for the presence of the disease of interest instead of testing each individual sample. We examine whether group testing can be used not only to reduce the costs incurred in testing a large number of individuals but also to improve the efficiency of estimating the underlying distribution function. We also seek to determine the optimal group size for nonparametric estimation of a distribution function under various group testing scenarios. Regression models for the group testing approach are briefly considered.
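The adjustment to the pool-adjacent-violators algorithm mentioned above can be illustrated with a minimal sketch. It is not taken from the dissertation and makes several assumptions: sensitivity and specificity are known, constant over time, and sum to more than 1, so that the probability of an observed positive at monitoring time c is (1 - specificity) + (sensitivity + specificity - 1) F(c). Under those assumptions, one way to proceed is to run the usual isotonic fit on the observed statuses, truncate it to the attainable range [1 - specificity, sensitivity], and invert the linear map. The function names `pava` and `npmle_misclassified` are hypothetical.

```python
def pava(y, w=None):
    """Weighted non-decreasing isotonic regression (pool-adjacent-violators)."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool backwards while adjacent block means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fit = []
    for mean, _, n in blocks:
        fit.extend([mean] * n)
    return fit


def npmle_misclassified(times, status, sensitivity, specificity):
    """Sketch: estimate F at the distinct monitoring times.

    Assumes known, time-constant sensitivity and specificity (illustrative
    assumption, not necessarily the dissertation's exact procedure).
    """
    if sensitivity + specificity <= 1:
        raise ValueError("need sensitivity + specificity > 1")
    # Pool observations sharing a monitoring time: (count, number positive).
    counts = {}
    for t, y in zip(times, status):
        n, s = counts.get(t, (0, 0))
        counts[t] = (n + 1, s + y)
    uniq = sorted(counts)
    means = [counts[t][1] / counts[t][0] for t in uniq]
    weights = [counts[t][0] for t in uniq]
    # Isotonic estimate of P(observed status = 1 | monitoring time).
    g = pava(means, weights)
    lo, hi = 1.0 - specificity, sensitivity
    scale = sensitivity + specificity - 1.0
    # Truncate to the attainable range, then invert the linear map to get F.
    f = [(min(max(gi, lo), hi) - lo) / scale for gi in g]
    return uniq, f


if __name__ == "__main__":
    t, f = npmle_misclassified([3, 1, 4, 2], [0, 0, 1, 1], 0.9, 0.9)
    for ti, fi in zip(t, f):
        print(ti, round(fi, 3))
```

With sensitivity and specificity both set to 1, the sketch reduces to the ordinary pool-adjacent-violators estimate. Time-varying misclassification rates, which the dissertation also treats, are outside what this sketch covers.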
We also describe current status data from the perspective of counting processes, examining the relationship between current status data and simple counting processes. Specifically, we consider the multistate model defined by two survival times of interest, where one only observes whether or not each of the individual survival times exceeds a common observed monitoring time. We are interested in estimating the distribution function of the time to the first event and in whether current status information on the subsequent event can be used to improve this estimate. For both single and multiple monitoring time scenarios, in the fully nonparametric setting, one cannot improve on the naive estimator, which uses information on the first event only, when estimating smooth functionals of the distribution of time to the first event (van der Laan and Jewell, 2003). We therefore examine improving this naive estimator when parametric assumptions about the waiting time between the two events are made. For situations where this waiting time is modifiable by design, we also determine the optimal length of the waiting time for estimation of the cumulative hazard of the distribution of time to the first event in the recent past. The ideas are motivated by and applied to an example on simultaneous accurate and diluted HIV test data.