Determining the location of objects relative to ourselves is essential for interacting with the world. Neural activity in the retina is used to form a vision-independent model of the local spatial environment relative to the body. For example, when an animal navigates through a forest, it rapidly shifts its gaze to identify the positions of important objects, such as a tree obstructing its path. This seemingly trivial behavior belies a sophisticated neural computation: visual information entering the brain in a retinocentric reference frame must be transformed into an egocentric reference frame to guide motor planning and action. This transformation, in turn, allows the animal to extract the location of the tree and plan a path around it.

In this review, we explore the anatomical, physiological, and computational implementation of retinocentric-to-egocentric reference frame transformations, a research area undergoing rapid progress stimulated by an ever-expanding molecular, physiological, and computational toolbox for probing neural circuits. We begin by summarizing evidence for retinocentric and egocentric reference frames in the brains of diverse organisms, from insects to primates. Next, we cover how distance estimation contributes to creating a three-dimensional representation of local space. We then review proposed implementations of reference frame transformations across biological and artificial neural networks. Finally, we discuss how an internal egocentric model of the environment is maintained independently of the sensory inputs from which it is derived. By comparing findings across a variety of nervous systems and behaviors, we aim to inspire new avenues for investigating the neural basis of reference frame transformation, a canonical computation critical for modeling the external environment and guiding goal-directed behavior.
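To make the underlying geometry concrete, the sketch below illustrates one simplified way a retinocentric direction could be combined with gaze posture and an estimated distance to yield an egocentric (head-centered) location. It is a minimal illustration, not a description of any neural circuit discussed in this review: the function names, the x-forward/y-left/z-up coordinate convention, the restriction to eye yaw and pitch (ignoring torsion and head or body movement), and the fixed eye offset are all assumptions made here for clarity.

```python
import numpy as np

def gaze_rotation(yaw, pitch):
    """Rotation taking eye-fixed (retinocentric) axes into head-fixed axes.
    Illustrative convention: x forward, y left, z up; yaw about z, pitch about y."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    R_yaw = np.array([[cy, -sy, 0.0],
                      [sy,  cy, 0.0],
                      [0.0, 0.0, 1.0]])
    R_pitch = np.array([[ cp, 0.0,  sp],
                        [0.0, 1.0, 0.0],
                        [-sp, 0.0,  cp]])
    return R_yaw @ R_pitch

def retino_to_egocentric(dir_retina, distance, yaw, pitch, eye_offset):
    """Scale a unit retinocentric direction by the estimated distance, rotate it
    by the current gaze angles, and translate by the eye's position in the head."""
    return gaze_rotation(yaw, pitch) @ (distance * np.asarray(dir_retina)) + eye_offset

# Example: an object centered on the retina, 2 m away, viewed with the eye
# rotated 30 degrees to the left, ends up 30 degrees to the left in head coordinates.
p = retino_to_egocentric(dir_retina=[1.0, 0.0, 0.0], distance=2.0,
                         yaw=np.deg2rad(30), pitch=0.0,
                         eye_offset=np.array([0.03, 0.0, 0.0]))
print(p)  # approximately [1.76, 1.0, 0.0]
```

The sketch makes explicit the two ingredients the review returns to repeatedly: an estimate of distance (needed to turn a retinal direction into a three-dimensional location) and knowledge of the current posture of the eyes relative to the body (needed to rotate and translate that location into an egocentric frame).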