This dissertation comprises a study of estimation methods in classical and test theories and the elaboration and application of a cluster-robust variance estimator (CRVE). Variance estimators derived from generalized estimating equations are known to be robust to most covariance structures and are therefore well suited to psychometric analysis of longitudinal test data. However, the approximate normal distribution of the test statistic for clustered binary experiments breaks down when the variation between cluster variances is large. In unbalanced experiments, the degrees of freedom of the test statistic are smaller than the number of clusters and closer to an effective number of clusters, G*, which we estimate using the Satterthwaite approximation. We calculate a bias bound as a function of G* to improve the coverage percentage of the test statistic. Simulations generated by a beta-binomial model and a Markov chain model show that the bias-adjusted CRVE improves the test statistic and achieves a coverage percentage of at least 94\% in highly heteroskedastic settings. For conservative confidence intervals in even more unbalanced situations, t-scores with G* degrees of freedom can be used. Compared with a quasibinomial generalized linear model and a wild bootstrap estimator, the bias-adjusted CRVE is closer to the asymptotic distribution for low effective numbers of clusters and yields results almost equivalent to those of the other two estimators across simulations. Consistency conditions based on cluster heterogeneity are shown to be sufficient for convergence of a chi-square test of multiple probabilities across each cluster. We show that the chi-square statistic can be used to test for parallel scales, equivalent items, or time effects in classical test analysis of longitudinal data.
Finally, we discuss the use of generalized estimating equations and the multivariate cluster-robust variance estimator in Rasch analysis of repeated measures.
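
As a rough illustration of the quantities named above, the following Python sketch computes a cluster-robust variance for a pooled proportion from clustered binary counts, together with a generic Satterthwaite-type effective number of clusters. It is a minimal sketch under stated assumptions, not the estimator developed in the dissertation: the function names `cluster_robust_proportion` and `effective_clusters` are hypothetical, the sandwich form shown is the textbook one for an intercept-only model, the ratio used for the effective number of clusters is the standard moment-matching form and may differ from the definition of G* used here, and the bias adjustment is not included.

```python
def cluster_robust_proportion(successes, sizes):
    """Pooled proportion with a textbook cluster-robust (sandwich) variance.

    successes[g] is the number of successes in cluster g and sizes[g] is
    the number of binary trials in that cluster (illustrative inputs).
    """
    total_n = sum(sizes)
    p_hat = sum(successes) / total_n
    # Cluster-level score residuals: observed minus expected successes.
    resid = [y - n * p_hat for y, n in zip(successes, sizes)]
    # Sandwich variance without any small-sample or bias correction.
    var_hat = sum(r * r for r in resid) / total_n ** 2
    return p_hat, var_hat, resid


def effective_clusters(variance_contributions):
    """Generic Satterthwaite-type ratio (sum v)^2 / sum(v^2).

    Equals the number of clusters when all contributions are equal and
    shrinks toward 1 as a single cluster dominates. Assumption: this is
    a stand-in for G*, not the dissertation's exact definition.
    """
    s1 = sum(variance_contributions)
    s2 = sum(v * v for v in variance_contributions)
    return s1 * s1 / s2


# Illustrative, made-up data: four clusters of very different sizes.
successes = [3, 40, 2, 5]
sizes = [10, 100, 5, 10]
p_hat, var_hat, resid = cluster_robust_proportion(successes, sizes)
# Working assumption for this example only: each cluster's variance
# contribution is taken to be proportional to its squared size.
g_star = effective_clusters([n * n for n in sizes])
```

With an estimate such as `g_star` in hand, a conservative interval would use a t critical value with that many degrees of freedom in place of the normal critical value, as the abstract describes for highly unbalanced designs.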