Objective: Management of thyroid nodules with Bethesda Category III and IV cytology on fine needle aspiration (FNA) is challenging because these nodules cannot be adequately classified as benign or malignant. Ultrasound (US) patterns have demonstrated utility in evaluating the risk of malignancy (ROM) of Bethesda Category III nodules. This study evaluates the value of three well-established US grading systems (ATA, Korean-TIRADS [K-TIRADS], and ACR-TIRADS) in determining ROM in Bethesda Category IV nodules.

Methods: Ninety-two patients with 92 surgically resected thyroid nodules and Bethesda Category IV cytology on FNA were identified. Nodule images were retrospectively graded using the three systems in a blinded manner. Associations between US risk category and malignant pathology were analyzed for each system.

Results: Of the 92 nodules, 56 (61%) were benign and 36 (39%) were malignant. Malignancy was found in 47% of ATA high-risk nodules, 53% of K-TIRADS category 5 nodules, and 50% of ACR-TIRADS category 5 nodules. The ATA high-risk category had 25% sensitivity, 82% specificity, and 47% positive predictive value (PPV) for malignancy. K-TIRADS category 5 had 25% sensitivity, 85% specificity, and 53% PPV for malignancy. ACR-TIRADS category 5 had 25% sensitivity, 84% specificity, and 50% PPV for malignancy. None of the three grading systems showed a statistically significant association between US risk category and ROM (p = 0.30, 0.72, and 0.28).

Conclusion: The ATA, K-TIRADS, and ACR-TIRADS classification systems are not helpful in stratifying ROM in patients with Bethesda Category IV nodules. Clinicians should be cautious about relying on ultrasound alone when deciding between therapeutic options for patients with Bethesda Category IV thyroid nodules.
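
To make the arithmetic behind the reported sensitivity, specificity, and PPV explicit, the sketch below computes the three measures from a 2x2 table. The abstract does not report the underlying cell counts. The counts used for the ATA high-risk category (9 true positives, 10 false positives, 27 false negatives, 46 true negatives) are an assumption, back-calculated from the reported percentages and the totals of 36 malignant and 56 benign nodules. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, and PPV as fractions of 1."""
    return {
        "sensitivity": tp / (tp + fn),  # malignant nodules flagged as high risk
        "specificity": tn / (tn + fp),  # benign nodules not flagged as high risk
        "ppv": tp / (tp + fp),          # flagged nodules that were malignant
    }


# ASSUMED counts for the ATA high-risk category. They are inferred from the
# reported percentages (25%, 82%, 47%) and the totals (36 malignant,
# 56 benign), not taken from the study data.
ata = diagnostic_metrics(tp=9, fp=10, fn=27, tn=46)

for name, value in ata.items():
    print(f"{name}: {value:.0%}")
```

Under these assumed counts the output matches the reported ATA figures: 25% sensitivity, 82% specificity, and 47% PPV.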