Visual clutter affects our ability to see. Objects that would be identifiable on their own may become unrecognizable when presented close together ("crowding"), but the psychophysical characteristics of crowding have resisted simplification. Image properties initially thought to produce crowding have yielded paradoxical results; for example, adding flanking objects can ameliorate crowding (Manassi, Sayim, & Herzog, 2012; Herzog, Sayim, Chicherov, & Manassi, 2015; Pachai, Doerig, & Herzog, 2016). The resulting theory revisions have been so complex and specialized that it is difficult to discern what principles may underlie the observed phenomena. Here, a generalized formulation of simple visual contrast energy is presented, arising from straightforward analyses of center and surround neurons in the early visual stream. Extant contrast measures, such as root mean square contrast, are easily shown to fall out as reduced special cases. The new generalized contrast energy metric surprisingly predicts the principal findings of a broad range of crowding studies. These early crowding phenomena may thus be said to arise predominantly from contrast or are, at least, severely confounded by contrast effects. Note that these findings may be distinct from accounts of other, likely downstream, "configural" or "semantic" instances of crowding, suggesting at least two separate forms of crowding that may resist unification. The new fundamental contrast energy formulation provides a candidate explanatory framework that addresses multiple psychophysical phenomena beyond crowding.