eScholarship
Open Access Publications from the University of California

Modeling Social Learning Through Demonstration in Multi-Armed Bandits

Creative Commons 'BY' version 4.0 license
Abstract

Humans are efficient social learners who leverage social information to rapidly adapt to new environments, but the computations by which we combine social information with prior knowledge are poorly understood. We study social learning within the context of multi-armed bandits using a novel “asteroid mining” video game where participants learn through active play and passive observation of expert and novice players. We simulate human exploration and social learning using naive versions of Thompson and Upper Confidence Bound (UCB) solvers and hybrid models that use Thompson and UCB solvers for direct learning together with a multi-layer perceptron to estimate what should be learned from other players. Two variants of the hybrid models provide good, parameter-free fits to human performance across a range of learning conditions. Our work shows a route for integrating social learning into reinforcement learning models and suggests that human social learning conforms to the predictions of such models.
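As a rough illustration of the two direct-learning solvers named in the abstract, the sketch below implements textbook Thompson sampling and UCB1 on a Bernoulli bandit. It is not the authors' implementation: the Bernoulli reward model, the uniform Beta(1, 1) prior, the UCB1 exploration constant, and every function name (`thompson_choose`, `ucb1_choose`, `run`) are assumptions made for illustration. The multi-layer perceptron used for social learning is not modeled here.

```python
import math
import random


def thompson_choose(successes, failures, rng):
    """Sample each arm's Beta posterior and pick the arm with the largest draw."""
    # Assumed prior: Beta(1, 1), i.e. uniform over each arm's payout probability.
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def ucb1_choose(successes, failures, t):
    """Pick the arm with the highest UCB1 index; untried arms are pulled first."""
    best_arm, best_score = 0, -math.inf
    for arm, (s, f) in enumerate(zip(successes, failures)):
        n = s + f
        if n == 0:
            return arm
        # Empirical mean plus an exploration bonus that shrinks with more pulls.
        score = s / n + math.sqrt(2 * math.log(t) / n)
        if score > best_score:
            best_arm, best_score = arm, score
    return best_arm


def run(policy, probs, steps, seed=0):
    """Play a Bernoulli bandit; return total reward and per-arm pull counts."""
    rng = random.Random(seed)
    k = len(probs)
    successes, failures = [0] * k, [0] * k
    total = 0
    for t in range(1, steps + 1):
        if policy == "thompson":
            arm = thompson_choose(successes, failures, rng)
        else:
            arm = ucb1_choose(successes, failures, t)
        reward = 1 if rng.random() < probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total += reward
    return total, [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    for policy in ("thompson", "ucb"):
        total, pulls = run(policy, probs=[0.2, 0.5, 0.8], steps=500)
        print(policy, "reward:", total, "pulls per arm:", pulls)
```

Both policies keep only per-arm success and failure counts, so one hypothetical way to add social information would be to adjust those counts from observed play. How the paper's hybrid models actually combine the two sources is not specified in the abstract.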
