Frequent words tend to be short, and many researchers have proposed that this relationship reflects a tendency towards efficient communication. Recent work has sought to formalize this observation in the context of information theory, which establishes a limit on communicative efficiency called the channel capacity. In this paper, I first show that the compositional structure of natural language prevents natural language communication from getting close to the channel capacity, but that a different limit, which incorporates probability in context, may be achievable. Next, I present two corpus studies in three typologically diverse languages that provide evidence that languages change over time towards the achievable limit. These results suggest that natural language optimizes for efficiency over time, and does so in a way that is appropriate for compositional codes.