- Main
Passing the Moral Turing Test
Abstract
The translation problem in moral AI asks how insights into human norms and values can be translated into a form suitablefor implementation in artificial systems. I argue that if my answer to a question about the human mind is right, thenthe translation problem is more tractable than previously thought. Specifically, I argue that we can use principles fromreinforcement learning to study human moral cognition, and that we can use principles from the resulting evaluative moralpsychology to design artificial systems capable of passing the Moral Turing Test (Allen, 2000). I illustrate the core featuresof my proposal by describing one such environment, or gridworld, in which an agent learns to trade-off between monetaryprofit and fair dealing, as characterized in behavioral economic paradigms. I conclude by highlighting the core technicaland philosophical advantages of such an approach for modeling moral cognition more broadly construed.
Main Content
Enter the password to open this PDF file:
-
-
-
-
-
-
-
-
-
-
-
-
-
-