Emotion and music are intrinsically connected, yet researchers have had only limited success in employing computational models to predict perceived emotion in music. Here, we use computational dimension-reduction techniques to discover meaningful representations of music. For static emotion prediction, i.e., predicting one valence/arousal value for each 45s musical excerpt, we explore the use of triplet neural networks to discover a representation that differentiates emotions more effectively. This reduced representation is then used in a classification model, which outperforms the original model trained on raw audio. For dynamic emotion prediction, i.e., predicting one valence/arousal value every 500ms, we examine how meaningful representations can be learned through a variational autoencoder (a state-of-the-art architecture effective in untangling information-rich structure in noisy signals). Although vastly reduced in dimensionality, our model achieves state-of-the-art emotion prediction accuracy. This approach enables us to identify which features underlie emotional content in music.
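To make the representation-learning idea concrete, the following is a minimal illustrative sketch (not the authors' implementation) of a triplet network trained with a margin loss: excerpts sharing an emotion label are pulled together in a low-dimensional embedding space, while excerpts with differing labels are pushed apart. The feature dimension, embedding size, network shape, and margin are all hypothetical choices.

```python
# Sketch of triplet-based representation learning for emotion-labelled excerpts.
# Assumptions: 512-dimensional audio feature vectors, 32-dimensional embeddings,
# margin of 1.0 -- none of these values come from the paper.
import torch
import torch.nn as nn

class EmbeddingNet(nn.Module):
    """Maps a high-dimensional audio feature vector to a low-dimensional embedding."""
    def __init__(self, in_dim=512, emb_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, emb_dim),
        )

    def forward(self, x):
        return self.net(x)

model = EmbeddingNet()
# Triplet margin loss: distance(anchor, positive) should be smaller than
# distance(anchor, negative) by at least the margin.
loss_fn = nn.TripletMarginLoss(margin=1.0)

# Dummy batch: anchor and positive share an emotion label, negative differs.
anchor, positive, negative = (torch.randn(8, 512) for _ in range(3))
loss = loss_fn(model(anchor), model(positive), model(negative))
loss.backward()  # gradients for one training step on the embedding network
```

After training, the embeddings (rather than the raw audio features) would be fed to a downstream classifier, mirroring the static-prediction pipeline described above.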