Stojic, Hrvoje; Analytis, Pantelis P; Speekenbrink, Maarten

Human behavior in contextual multi-armed bandit problems

2015

Abstract

In real-life decision environments people learn from their direct experience with alternative courses of action. Yet they can accelerate their learning by using functional knowledge about the features characterizing the alternatives. We designed a novel contextual multi-armed bandit task where decision makers chose repeatedly between multiple alternatives characterized by two informative features. We compared human behavior in this contextual task with a classic multi-armed bandit task without feature information. Behavioral analysis showed that participants in the contextual bandit task used the feature information to direct their exploration of promising alternatives. Ex post, we tested participants' acquired functional knowledge in one-shot multi-feature choice trilemmas. We compared a novel function-learning-based reinforcement learning model to a classic reinforcement learning model. Although classic reinforcement learning models predicted behavior better in the learning phase, the new models did better in predicting the trilemma choices.
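The contrast drawn in the abstract, between learning about each alternative separately and learning how features map to rewards, can be illustrated with a minimal sketch. This is not the authors' model or task: the linear payoff, the weights `(2.0, 1.0)`, the learning rates, the epsilon-greedy choice rule, and all class and function names are assumptions chosen for illustration only.

```python
import random


def expected_reward(features, weights=(2.0, 1.0)):
    """Hypothetical linear payoff over two features (assumed, not from the paper)."""
    return weights[0] * features[0] + weights[1] * features[1]


class MeanTracker:
    """Classic delta-rule learner: one value per arm, features are ignored."""

    def __init__(self, n_arms, lr=0.3):
        self.values = [0.0] * n_arms
        self.lr = lr

    def predict(self, arm, features):
        return self.values[arm]

    def update(self, arm, features, reward):
        self.values[arm] += self.lr * (reward - self.values[arm])


class FunctionLearner:
    """Learns a linear feature-to-reward mapping, so it generalizes across arms."""

    def __init__(self, lr=0.1):
        self.w = [0.0, 0.0]
        self.lr = lr

    def predict(self, arm, features):
        return self.w[0] * features[0] + self.w[1] * features[1]

    def update(self, arm, features, reward):
        # Least-mean-squares step on the prediction error.
        err = reward - self.predict(arm, features)
        self.w = [w + self.lr * err * x for w, x in zip(self.w, features)]


def run(learner, arms, n_trials, rng, epsilon=0.1, noise=0.1):
    """Epsilon-greedy play over a fixed set of arms; returns total reward."""
    total = 0.0
    for _ in range(n_trials):
        if rng.random() < epsilon:
            arm = rng.randrange(len(arms))
        else:
            arm = max(range(len(arms)), key=lambda a: learner.predict(a, arms[a]))
        reward = expected_reward(arms[arm]) + rng.gauss(0, noise)
        learner.update(arm, arms[arm], reward)
        total += reward
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    arms = [(rng.random(), rng.random()) for _ in range(20)]
    print("mean tracker:", run(MeanTracker(len(arms)), arms, 100, random.Random(1)))
    print("function learner:", run(FunctionLearner(), arms, 100, random.Random(1)))
```

Under these assumptions, a single observation updates the function learner's prediction for every arm that shares the observed features, whereas the mean tracker only updates the arm that was chosen. That difference is the kind of feature-guided generalization the abstract describes, expressed in its simplest form.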

Proceedings of the Annual Meeting of the Cognitive Science Society
