We describe a system that can associate images with Englishproverbs. We start from a corpus of proverbs, harvest relatedimages from the web and use this data to train two variants ofa convolutional neural network. We then collect a small set ofannotations, and use these to combine the outputs of the twonetworks into a single prediction for each input image. Wecarry out feature selection experiments on a set of features de-rived from the images and from the predicted proverbs, anddemonstrate that the metaphoricity of the proverbs plays a sig-nificant role in classification accuracy. An empirical evalua-tion with human raters confirms the system’s ability to abstractfrom the raw bits in the images and to learn meaningful, non-trivial associations.