In a compositional language, the meaning of a sentence is a function of the meaning of its parts and the way they are combined. Recent computational models of the emergence of compositionality have focused on the emergence of words which encode sub-units of meaning in sub-units of form. Decidedly less attention has been paid to the emergence of rules governing the combination of these words. Our work uses LSTM networks in an iterated learning set-up to provide an account of how some aspects of compositional structure may emerge through cumulative cultural evolution. We present a novel metric for assessing the degree of positional structure present in an emergent model and use it to illustrate how canonical word order may emerge naturally in LSTM models. This supports the notion that some elements of linguistic structure result more from the dynamics of language transmission and use than from domain-specific cognitive biases.