Compositionality is considered a key hallmark of human
language. However, most research focuses on item-level compositionality, e.g.,
the extent to which the meanings of phrases are composed of the meanings of their
sub-parts, rather than on language-level compositionality, i.e., the degree
to which the possible combinations are actually used in practice.
Here, we propose a novel way to quantify the degree of language-level
compositionality and apply it to English adjective–noun
combinations. Using corpus analyses, large language models, and human
acceptability ratings, we find that (1) English only sparsely utilizes the
compositional potential of adjective–noun combinations; and (2) LLMs struggle to
predict human acceptability judgments of rare combinations. Taken together, our
findings shed new light on the role of compositionality in language and
highlight a challenging area for further improving LLMs.