Perturbation Training for Human-Robot Teams

Title: Perturbation Training for Human-Robot Teams
Publication Type: Thesis
Year of Publication: 2015
Authors: Ramakrishnan, R.
Academic Department: Department of Electrical Engineering and Computer Science
Degree: S.M.
Abstract: Today, robots are often deployed to work separately from people. Combining the strengths of humans and robots, however, can potentially lead to a stronger joint team. To have fluid human-robot collaboration, these teams must train to achieve high team performance and flexibility on new tasks. This requires a computational model that supports the human in learning and adapting to new situations. In this work, we design and evaluate a computational learning model that enables a human-robot team to co-develop joint strategies for performing novel tasks requiring coordination. The joint strategies are learned through "perturbation training," a human team-training strategy that requires practicing variations of a given task to help the team generalize to new variants of that task. Our Adaptive Perturbation Training (AdaPT) algorithm is a hybrid of transfer learning and reinforcement learning techniques and extends the Policy Reuse in Q-Learning (PRQL) algorithm to learn more quickly in new task variants. We empirically validate this advantage of AdaPT over PRQL through computational simulations. We then augment AdaPT with a co-learning framework and a computational bi-directional communication protocol so that the robot can work with a person in live interactions. These three features constitute our human-robot perturbation training model. We conducted human subject experiments to show, as a proof of concept, that our model enables a robot to draw from its library of prior experiences in a way that leads to high team performance. We compare our algorithm with a standard reinforcement learning algorithm, Q-learning, and find that AdaPT-trained teams achieved significantly higher reward on novel test tasks than Q-learning teams. This indicates that the robot's algorithm, rather than just the human's experience of perturbations, is key to achieving high team performance. We also show that our algorithm does not sacrifice performance on the base task after training on perturbations. Finally, we demonstrate that human-robot training in a simulation environment using AdaPT produced effective team performance with an embodied robot partner.
URL: http://hdl.handle.net/1721.1/99845