We consider the problem of predicting how humans learn interactively in an adversarial Multi-Armed Bandit (MAB) setting. In a cybersecurity scenario, we designed defense algorithms that assign decoys to lure attackers. In an experiment, human participants played the role of cyberattackers and tried to learn the defense strategy over repeated interactions. Participants played against one of three defense algorithms: a stationary strategy, a static game-theoretic solution, and an adaptive MAB strategy. Our results show that humans have the most difficulty learning against the adaptive defense. We also evaluated five different models of attack behavior and compared their predictions against human data. We show that a modified version of Thompson Sampling and a cognitive model based on Instance-Based Learning Theory are the best at replicating human learning against the defense strategies. We discuss how these models of human attackers can inform future cyberdefense tools.