Human adaptive decision-making recruits multiple cognitive processes for learning stimulus-action (SA) associations. These processes include reinforcement learning (RL), which gradually estimates the values of choices to guide future reward-driven decisions; episodic memory (EM), which stores precise event information for long-term retrieval; and working memory (WM), which serves as flexible but temporary, capacity-limited storage. However, we have limited understanding of how these systems work together. Here, we introduce a new one-shot RL task to disentangle their respective roles. In 16 independent 8-trial blocks, 144 participants used one-shot rewards to learn 4 new SA associations per block. Each block provided one chance to obtain feedback for pressing one of two keys for each stimulus (trials 1--4), followed by a chance to use this feedback to make a choice in a short-term association task (trials 5--8; no feedback), primarily targeting WM. In a subsequent testing phase designed to assess long-term retention through RL or EM, all 64 stimuli were shown in randomized order and participants were asked to press the correct key for each, without feedback. Trials 5--8 revealed WM-dependent strategy effects on choice accuracy, as well as a role for both RL and EM when WM was overwhelmed. Testing phase accuracy depended on feedback interacting with initial presentation order, revealing signatures of both RL and EM in learning from one-shot rewards. Computational modeling suggests that a mixture model combining RL and EM components best fits group-level testing phase behavior. Our results show that the new protocol can identify signatures of the contributions of each of the three memory systems to reward-based learning. This approach creates new possibilities for understanding how each system integrates a single bit of information, what each contributes to choice, and how the systems interact.
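
The trial structure described above can be illustrated with a minimal sketch of a schedule generator. The counts (16 blocks, 8 trials per block, 4 stimuli per block, 64 stimuli at test) come from the text; the function name `make_schedule`, the integer stimulus and key labels, the random assignment of correct keys, and the shuffled stimulus order in trials 5--8 are assumptions made only for illustration and may differ from the actual protocol.

```python
import random

N_BLOCKS = 16        # from the text: 16 independent blocks
STIM_PER_BLOCK = 4   # from the text: 4 new SA associations per block
KEYS = (0, 1)        # two response keys; the labels 0/1 are hypothetical


def make_schedule(seed=0):
    """Build a hypothetical trial schedule for the one-shot task."""
    rng = random.Random(seed)
    blocks = []
    for b in range(N_BLOCKS):
        # Each block introduces 4 stimuli that appear in no other block.
        stimuli = [b * STIM_PER_BLOCK + i for i in range(STIM_PER_BLOCK)]
        # Assumption: the correct key is drawn at random for each stimulus.
        correct = {s: rng.choice(KEYS) for s in stimuli}
        # Trials 1-4: one feedback opportunity per stimulus.
        learn = [
            {"trial": t + 1, "stimulus": s, "feedback": True}
            for t, s in enumerate(stimuli)
        ]
        # Trials 5-8: each stimulus once more, without feedback.
        # Assumption: the order is reshuffled relative to trials 1-4.
        probe_order = stimuli[:]
        rng.shuffle(probe_order)
        probe = [
            {"trial": t + 5, "stimulus": s, "feedback": False}
            for t, s in enumerate(probe_order)
        ]
        blocks.append({"trials": learn + probe, "correct": correct})
    # Testing phase: all 64 stimuli in randomized order, no feedback.
    test_order = [s for blk in blocks for s in blk["correct"]]
    rng.shuffle(test_order)
    return blocks, test_order


blocks, test_order = make_schedule()
```

Under these assumptions, `blocks` holds 16 entries of 8 trials each, and `test_order` is a permutation of the 64 stimulus indices for the testing phase.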