- Gaynor, Bruce D;
- Amza, Abdou;
- Gebresailassie, Sintayehu;
- Kadri, Boubacar;
- Nassirou, Baido;
- Stoller, Nicole E;
- Yu, Sun N;
- Cuddapah, Puja A;
- Keenan, Jeremy D;
- Lietman, Thomas M
We assessed agreement in trachoma grading among field graders using photographs that spanned the complete spectrum of disease, and compared it with agreement on the subset of cases for which experienced graders had reached consensus. Trained photographers took photographs of children's conjunctivae during a clinical trial in Ethiopia. We calculated κ agreement statistics using the complete set of 60 cases and then recalculated κ using a consensus set limited to cases on which experienced graders agreed. With the complete set of 60 cases, agreement was moderate (κ = 0.61, 95% confidence interval [95% CI] = 0.56-0.67). With the consensus set, agreement improved significantly (κ = 0.75, 95% CI = 0.68-0.80). The κ of the consensus set was higher than that of the complete set by 0.14 (95% CI = 0.12-0.16; P < 0.001). If testing sets exclude difficult-to-grade cases, measured agreement in trachoma grading may be higher than that actually achieved in population-based trachoma surveys.
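The effect described above can be illustrated with a minimal sketch. The abstract does not say which κ variant was used or how multiple graders were pooled, so the two-rater Cohen's κ below is an assumption made for illustration only. The function name `cohen_kappa`, the grades, and the consensus flags are all hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(grades_a, grades_b):
    """Cohen's kappa for two raters grading the same cases (illustrative only)."""
    if len(grades_a) != len(grades_b) or not grades_a:
        raise ValueError("both raters must grade the same non-empty set of cases")
    n = len(grades_a)
    # Observed agreement: fraction of cases given the same grade by both raters.
    observed = sum(a == b for a, b in zip(grades_a, grades_b)) / n
    # Chance agreement: product of each rater's marginal frequency, summed over grades.
    count_a = Counter(grades_a)
    count_b = Counter(grades_b)
    expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical grades (1 = sign present, 0 = absent); NOT data from the study.
grader_a = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
grader_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
# Hypothetical flag: True where experienced graders reached consensus on the case.
consensus = [True] * 8 + [False] * 2

kappa_complete = cohen_kappa(grader_a, grader_b)

# Restrict to the consensus cases and recompute.
subset_a = [g for g, keep in zip(grader_a, consensus) if keep]
subset_b = [g for g, keep in zip(grader_b, consensus) if keep]
kappa_consensus = cohen_kappa(subset_a, subset_b)

print(f"complete set:  kappa = {kappa_complete:.2f}")
print(f"consensus set: kappa = {kappa_consensus:.2f}")
```

In this toy data the two disagreements fall on the non-consensus cases, so dropping them raises κ. This is the same direction of change the abstract reports, although the magnitudes here are arbitrary.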