From the noisy information bombarding our senses, our brains must construct percepts that are veridical (reflecting the true state of the world) and informative (conveying what we did not already know). Influential theories suggest that both challenges are met through mechanisms that use expectations about the likely state of the world to shape perception. Predictive coding, a theoretical framework in which the brain compares the predictions of a generative model to incoming sensory signals, seeks to explain this inferential process of perception. However, how predictive coding is implemented at a mechanistic level in individual neurons is not well understood. This dissertation builds on the field of computational modeling of sensory systems to further our understanding of perceptual inference in individual neurons in the auditory system of a songbird, the European starling. Chapter 1 summarizes some possible ways in which predictive coding could be realized at the algorithmic level. The main contribution of this chapter is not to present models that can uniquely account for sensory data, but to reconcile, under the umbrella of predictive coding, different cognitive behaviors that were previously considered incompatible. Chapter 2 combines computational modeling with electrophysiological recordings to study internal representations of sensory information under the predictive coding framework in single auditory neurons in songbirds. This study examines the unique components of response variance that are explained by predictive coding. It shows that, during song listening, the responses of individual neurons in primary and secondary auditory regions are modeled by expectations of future song and by the uncertainty of song. Generative machine learning approaches enable us to form hypotheses about internal generative models for complex natural stimuli. Chapter 3 explores the flexibility of the internal model and whether prediction error is correlated with behavior.
This study draws primarily on the complex structure of birdsong, developing methods to analyze the acoustic and temporal structure of vocal signals and then probing, behaviorally and physiologically, the underpinnings of the predictive structure of birdsong. The findings in this chapter indicate that prediction error encoding is modulated by task-relevant generative models and by behavior. Taken together, my dissertation work offers a unified account of inferential frameworks of perception in systems that process behaviorally relevant signals, and shows how these frameworks adapt to such systems.