We propose a neural network model that accounts for the emer-gence of the taxonomic constraint in early word learning. Ourproposal is based on Mayor and Plunkett (2010)’s neurocom-putational model of the taxonomic constraint and overcomesone of its limitations, namely the fact that it considers arti-ficially built, simplified stimuli. In fact, while in the originalmodel the visual stimuli are random, sparse dot patterns, in ourproposed solution they are photographic images from the Im-ageNet database. In our model the represented objects in theimage can be of different size, color, location in the picture,point of view, etc.. We show that, notwithstanding the aug-mented complexity in the input, the proposed model comparesfavorably with respect to Mayor and Plunkett (2010)’s model.