Frequency of reward and average reward value are two types
of reward information we utilize when making decisions
between two alternative options. Often, both pieces of
information favor the same, highest-value option. However,
when a slightly less valuable option is presented more
frequently, standard reinforcement learning models such as the
Delta model can make incorrect predictions. This paper
explores the discrepancy in these predictions by
simulating relevant behavioral tasks with the Delta model, the
Decay model, and a novel Bayesian model based on the
Dirichlet distribution. We then compare model predictions to
behavioral data from some of the same tasks that were
simulated. Compared to the Decay model and the two Bayesian
learning models, the Delta model provides a poor fit to the
data for each of the three presented tasks, because it
predicts a bias toward options with higher average reward,
whereas the Decay and Bayesian models predict a bias toward
reward frequency. The Decay and Bayesian models make similar
predictions and provide similar fits to the data for most of
the tasks: despite their different computational formalisms,
both predict a bias toward reward frequency rather than
average reward magnitude. However, we also
note some interesting discrepancies between the Decay and
Bayesian models, which show that, in some cases, the
frequency of reward may be more important than the reward
value.
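
The contrast between the Delta and Decay models can be
illustrated with a minimal sketch. It assumes the commonly used
forms of the two update rules (Delta: move the chosen option's
expectancy toward the reward by a learning rate; Decay: shrink
all expectancies by a decay factor, then add the reward to the
chosen option), which may differ from the exact
parameterization used in the paper. The reward values, the
presentation schedule, and the parameter values are
hypothetical and chosen only for illustration; rewards are
fixed at their means to keep the example deterministic.

```python
def delta_update(ev, choice, reward, alpha=0.1):
    """Delta rule (assumed form): nudge the chosen option's
    expectancy toward the observed reward."""
    ev = list(ev)
    ev[choice] += alpha * (reward - ev[choice])
    return ev


def decay_update(ev, choice, reward, decay=0.9):
    """Decay rule (assumed form): all expectancies decay each
    trial, and the reward is added to the chosen option."""
    ev = [decay * v for v in ev]
    ev[choice] += reward
    return ev


def simulate(update, schedule, rewards):
    """Apply an update rule over a fixed sequence of experienced options."""
    ev = [0.0, 0.0]
    for choice in schedule:
        ev = update(ev, choice, rewards[choice])
    return ev


# Hypothetical task: option 0 pays slightly less but is experienced
# twice as often as option 1.
REWARDS = [0.65, 0.75]
SCHEDULE = [0, 0, 1] * 50

delta_ev = simulate(delta_update, SCHEDULE, REWARDS)
decay_ev = simulate(decay_update, SCHEDULE, REWARDS)

print("Delta expectancies:", delta_ev)
print("Decay expectancies:", decay_ev)
```

Under these assumptions, the Delta expectancies track each
option's average reward, so the less frequent but more valuable
option ends up ranked higher, whereas the Decay expectancies
accumulate with each reward, so the more frequently rewarded
option ends up ranked higher.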