The log polar transformation is a key part of the mapping from the visual field to the primary visual cortex. This transformation has several properties that enhance performance on visual tasks, including a high concentration of information at the point of fixation. Combined with other components of the visual system, it enables humans to learn from limited labeled data.
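To make the fixation property concrete, the following is a minimal sketch of a log polar resampling in Python (not the implementation used in this work; the function name, output resolution, and nearest-neighbour sampling are illustrative assumptions). Because the radial samples are spaced logarithmically, pixels near the fixation point receive a disproportionately large share of the output grid.

```python
import numpy as np

def log_polar_transform(image, out_shape=(64, 64)):
    """Resample a grayscale image onto a log polar grid centred on the
    point of fixation (here taken to be the image centre).

    Rows of the output index angle and columns index log-radius, so the
    region around fixation is heavily oversampled relative to the
    periphery.
    """
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_radius = min(cy, cx)

    n_angles, n_radii = out_shape
    thetas = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    # Log-spaced radii: half the radial samples fall within ~sqrt(max_radius)
    # pixels of the fixation point.
    rhos = np.logspace(0.0, np.log10(max_radius), n_radii) - 1.0

    # Source coordinates for every (angle, log-radius) output cell.
    ys = cy + rhos[None, :] * np.sin(thetas[:, None])
    xs = cx + rhos[None, :] * np.cos(thetas[:, None])

    # Nearest-neighbour sampling keeps the sketch short; bilinear
    # interpolation is the usual choice in practice.
    ys = np.clip(np.round(ys).astype(int), 0, h - 1)
    xs = np.clip(np.round(xs).astype(int), 0, w - 1)
    return image[ys, xs]

if __name__ == "__main__":
    img = np.random.rand(128, 128)   # stand-in for a fixated view
    lp = log_polar_transform(img)
    print(lp.shape)                  # (64, 64)
```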
Humans excel at this style of learning, known as few-shot learning: they can accurately classify similar objects after exposure to only a few examples.
Self-supervised learning has emerged as a promising paradigm for addressing few-shot learning with vision models. It allows models to be pretrained on unlabeled data to learn meaningful internal representations, which can then be applied to downstream tasks where labeled data is scarce.
In this paper, we investigate the potential of incorporating the log polar transform into self-supervised learning. Our initial findings indicate that the log polar transform on its own may not be well suited to self-supervised learning. However, applying the log polar transform yields higher out-of-distribution generalization performance across multiple datasets and models.