Approaches to robot locomotion range from traditional hand-designed trajectories and optimization-based methods to more recent successes with deep reinforcement learning (DRL). The advantage of traditional model-based approaches lies in the ability to perform stability and robustness analyses on the generated motions and trajectories, but such methods can suffer from modeling mismatches or from computation times that are too long for real-time control. DRL, on the other hand, requires very little a priori information to (eventually) find a solution, at the cost of high sample complexity (long training times) and no stability or robustness guarantees on the learned policies. This thesis seeks to bridge the gap between these two areas by injecting model-based ideas and methods into deep reinforcement learning, with the goal of efficient, robust, and explainable learned control policies.
This strategy is applied to agile wheeled robot locomotion, for example on JPL’s RoboSimian and on vehicles performing drift parking. A novel wheel model is presented that, when applied in an optimization framework, naturally yields trajectories that avoid or exploit wheel slipping depending on the task. These methods demonstrate the first planned and controlled drifting maneuvers for a wheel-legged system, with additional hardware results on a model vehicle.