Computational Formulation, Modeling and Evaluation of Human-Robot Team Training Techniques
Publication Type
Year of Publication
2014
Authors
Stefanos Z Nikolaidis
Academic Department
Department of Aeronautics and Astronautics
Degree
S.M.
Abstract
This thesis focuses on designing mechanisms for programming robots and training people to perform human-robot collaborative tasks, drawing upon insights from practices widely used in human teams. First, we design and evaluate human-robot cross-training, a strategy used and validated for effective human team training. Cross-training is an interactive planning method in which a human and a robot iteratively switch roles to learn a shared plan for a collaborative task. We present a computational formulation of the robot mental model, which encodes the sequence of robot actions towards task completion and the robot's expectation over the preferred human actions, and show that it is quantitatively comparable to the human mental model that captures the inter-role knowledge held by the human. Additionally, we propose a quantitative measure of human-robot mental model convergence and an objective metric of mental model similarity. Based on this encoding, we formulate human-robot cross-training and evaluate it in human subject experiments (n = 36). We compare human-robot cross-training to standard reinforcement learning techniques, and show that cross-training provides statistically significant improvements in quantitative team performance measures. Additionally, significant differences emerge in the perceived robot performance and human trust. Finally, we discuss the objective measure of human-robot mental model convergence as a method to dynamically assess errors in human actions. This study supports the hypothesis that effective and fluent human-robot teaming may be best achieved by modeling effective practices for human teamwork. We also investigate the robustness of the learned policies to randomness in human behavior, and show that the learned policies are not robust to changes in human behavior after the training phase.
For this reason, we introduce a new framework that enables a robot to learn a robust policy to perform a collaborative task with a human. The human preference is modeled as a hidden variable in a Mixed Observability Markov Decision Process, which is inferred from joint-action demonstrations of a collaborative task. The framework automatically learns a user model from training data, and uses this model to plan an execution policy that is robust to changes in the human teammate’s behavior. We compare the effectiveness of the proposed framework to previous techniques that plan in state space, using data from the human subject experiments in which human and robot teams trained together to perform a place-and-drill task. Results demonstrate the robustness of the learned policy to increasing deviations in human behavior.
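The idea of treating the human preference as a hidden variable can be illustrated with a small belief update. The sketch below is not the thesis implementation. It assumes the hidden human type is static during a task, that the task state is fully observable, and that a per-type human action model is already available as a lookup table. The type names, state name, action names, and probabilities are all hypothetical placeholders.

```python
from typing import Dict

# Hypothetical user model: type -> state -> human action -> probability.
# In practice such a table would be learned from demonstrations; here it is
# hand-written purely for illustration.
UserModel = Dict[str, Dict[str, Dict[str, float]]]


def update_belief(
    belief: Dict[str, float],
    model: UserModel,
    state: str,
    human_action: str,
    eps: float = 1e-6,
) -> Dict[str, float]:
    """Bayes update of the belief over a static hidden human type.

    posterior(type) is proportional to prior(type) * P(action | state, type).
    Actions missing from the table get a small probability `eps` so that a
    single unexpected action does not rule a type out permanently.
    """
    unnormalized = {}
    for human_type, prior in belief.items():
        likelihood = model[human_type].get(state, {}).get(human_action, eps)
        unnormalized[human_type] = prior * likelihood
    total = sum(unnormalized.values())
    if total == 0.0:
        # Degenerate case (for example an all-zero prior): keep the old belief.
        return dict(belief)
    return {t: p / total for t, p in unnormalized.items()}


# Illustrative usage with two made-up human types.
example_model: UserModel = {
    "type_1": {"s0": {"place_a": 0.9, "place_b": 0.1}},
    "type_2": {"s0": {"place_a": 0.2, "place_b": 0.8}},
}
prior = {"type_1": 0.5, "type_2": 0.5}
posterior = update_belief(prior, example_model, "s0", "place_b")
print(posterior)
```

After observing an action that the second hypothetical type favors, the belief shifts towards that type. A planner that conditions its policy on this belief, rather than on a single assumed preference, is one way to obtain the kind of robustness to changing human behavior that the abstract describes.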