Multimodal discourse requires an assembly of cognitive processes that are uniquely recruited for language comprehension in social contexts. In this study, we investigated the role of verbal working memory for the online integration of speech and iconic gestures. Participants memorized and rehearsed a series of auditorily presented digits in low (one digit) or high (four digits) memory load conditions. To observe how verbal working memory load impacts online discourse comprehension, ERPs were recorded while participants watched discourse videos containing either congruent or incongruent speech-gesture combinations during the maintenance portion of the memory task. While expected speech-gesture congruity effects were found in the low memory load condition, high memory load trials elicited enhanced frontal positivities that indicated a unique interaction between online speech-gesture integration and the availability of verbal working memory resources. This work contributes to an understanding of discourse comprehension by demonstrating that language processing in a multimodal context is subject to the relationship between cognitive resource availability and the degree of controlled processing required for task performance. We suggest that verbal working memory is less important for speech-gesture integration than it is for mediating speech processing under high task demands.