When you see a glass fall off the table, you can predict it will break without seeing it hit the ground. Similarly, if you hear the glass shatter, you can infer what happened without seeing anything at all. Our knowledge of the causal mechanisms structuring our world allows us to make impressively accurate inferences based on incomplete information spread across multiple sensory modalities. In this work, we study the cognitive processing that supports this remarkable behavior. We use the Plinko domain, an intuitive physics setup in which a marble is dropped into a box from one of three holes and collides with obstacles as it falls to the ground. Participants judge which hole the marble fell from based on visual and auditory evidence. We track participants' eye gaze to gain deeper insight into their mental processes, and develop models that characterize the computational processes underlying participant behavior.