Language understanding in the real world occurs through noise, often a lot of noise. What makes language understanding so robust? Here, we address this question with a new approach. We cast language understanding as Bayesian inference in a generative model of how world states arise and project to utterances. We develop this model in a case study of action understanding from language input: inferring the goal of an agent in 2D grid worlds from utterances. The generative model provides a prior over agents' goals, a planner that maps these goals to actions, and a 'language-renderer' that creates utterances from these actions. The generative model also incorporates GPT-2 as a noisy language production model. We invert this process with sequential Monte Carlo. In a behavioral experiment, the resulting model, called the Generative Semantic Transformation Process, explains how human goal inferences evolve as utterances unfold.
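To make the inference scheme concrete, the following is a minimal sketch of goal inference by sequential importance resampling over a stream of words. It is not the model described above: the planner and GPT-2 are omitted, and the goal set, the word lists, the noise rate `NOISE`, and the functions `word_likelihood` and `smc_goal_posterior` are all hypothetical stand-ins chosen for illustration.

```python
import random
from collections import Counter

# Hypothetical toy setup: three goals, each with words that an
# utterance about that goal tends to contain.
GOAL_WORDS = {
    "key": ["grab", "key", "unlock"],
    "door": ["open", "door", "unlock"],
    "gem": ["grab", "gem", "shiny"],
}
VOCAB = sorted({w for ws in GOAL_WORDS.values() for w in ws})
NOISE = 0.2  # assumed probability that a word is drawn at random from VOCAB


def word_likelihood(word, goal):
    """P(word | goal): mostly goal-related words, sometimes any vocabulary word."""
    on_topic = 1.0 / len(GOAL_WORDS[goal]) if word in GOAL_WORDS[goal] else 0.0
    return (1 - NOISE) * on_topic + NOISE / len(VOCAB)


def smc_goal_posterior(words, n_particles=2000, seed=0):
    """Return the approximate posterior over goals after each incoming word."""
    rng = random.Random(seed)
    goals = list(GOAL_WORDS)
    # Each particle is one goal hypothesis drawn from a uniform prior.
    particles = [rng.choice(goals) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    history = []
    for word in words:
        # Reweight every particle by how well its goal explains the new word.
        weights = [w * word_likelihood(word, g) for w, g in zip(weights, particles)]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Record the weighted posterior estimate at this point in the utterance.
        mass = Counter()
        for w, g in zip(weights, particles):
            mass[g] += w
        history.append({g: mass[g] for g in goals})
        # Resample when the effective sample size falls below half the particles.
        ess = 1.0 / sum(w * w for w in weights)
        if ess < n_particles / 2:
            particles = rng.choices(particles, weights=weights, k=n_particles)
            weights = [1.0 / n_particles] * n_particles
    return history


if __name__ == "__main__":
    for step in smc_goal_posterior(["grab", "unlock", "key"]):
        print({g: round(p, 3) for g, p in step.items()})
```

Under these assumptions, "grab" alone leaves the "key" and "gem" goals roughly tied, and the later words shift most of the posterior mass to "key". Because `NOISE` is nonzero, an off-topic or unknown word lowers a hypothesis's weight without eliminating it, which is the sense in which this kind of inference tolerates noisy input.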