Tracking and surveillance of the HIV epidemic depend on accurate estimation of the number of new infections in the population. The rate at which these infections occur, known as the incidence, is also critical for effectively designing, targeting, and evaluating prevention efforts. Incidence can be estimated from cross-sectional surveys by using biomarkers, such as HIV viral load, CD4 cell count, and recently developed serologic assays, to identify people in an early disease stage. The total number of individuals found in this stage, that is, possessing markers of recent infection, gives a snapshot of how the epidemic is progressing.
We explore how best to combine these biomarkers to define this early disease stage by examining how the definition influences the bias and variability of the cross-sectional incidence estimator. These calculations depend on estimating the probability that a person remains in the early disease stage $t$ years after seroconversion. We present two different approaches for estimating this probability curve. Once we have defined viable methods for combining these biomarkers, we derive the sample sizes needed to conduct one or more cross-sectional surveys and explore how missing biomarker data should be handled when implementing these surveys.
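
For concreteness, one commonly used form of the cross-sectional (snapshot) estimator is sketched below. It is an illustration under stated assumptions and not necessarily the exact estimator studied here. The symbols $\phi(t)$, $\mu$, $R$, $N$, and $\hat{\lambda}$ are introduced only for this sketch: $\phi(t)$ is the probability of remaining in the early disease stage $t$ years after seroconversion, $\mu$ is the mean time spent in that stage, $R$ is the number of surveyed persons found in the early stage, and $N$ is the number found HIV-negative.

$$
\mu = \int_0^{\infty} \phi(t)\,dt, \qquad \hat{\lambda} = \frac{R}{N\,\hat{\mu}}
$$

With purely hypothetical values $R = 30$, $N = 5000$, and $\hat{\mu} = 0.5$ years, this gives $\hat{\lambda} = 30 / (5000 \times 0.5) = 0.012$ infections per person-year, or 1.2 per 100 person-years. The sketch shows why the definition of the early stage matters: it determines both $R$ and $\phi(t)$, and therefore $\mu$, so it directly affects the bias and variability of $\hat{\lambda}$.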