Human-Machine Collaborative Optimization via Apprenticeship Scheduling

Title	Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Publication Type	Thesis
Year of Publication	2017
Authors	Gombolay, M. C.
Academic Department	Department of Aeronautics and Astronautics
Degree	Ph. D.
Abstract	I envision a future where intelligent service robots become integral members of human-robot teams in the workplace. Today, service robots are being deployed across a wide range of settings; however, while these robots exhibit basic navigational abilities, they lack the ability to anticipate and adapt to the needs of their human teammates. I believe robots must be capable of autonomously learning from humans how to integrate into a team à la a human apprentice. Human domain experts and professionals become experts over years of apprenticeship, and this knowledge is not easily codified in the form of a policy. In my thesis, I develop a novel computational technique, Collaborative Optimization Via Apprenticeship Scheduling (COVAS), that enables robots to learn a policy to capture an expert's knowledge by observing the expert solve scheduling problems. COVAS can then leverage the policy to guide branch-and-bound search to provide globally optimal solutions faster than state-of-the-art optimization techniques. Developing an apprenticeship learning technique for scheduling is challenging because of the complexities of modeling and solving scheduling problems. Previously, researchers have sought to develop techniques to learn from human demonstration; however, these approaches have rarely been applied to scheduling because of the large number of states required to encode the possible permutations of the problem and relevant problem features (e.g., a job's deadlines, required resources, etc.). My thesis gives robots a novel ability to serve as teammates that can learn from and contribute to coordinating a human-robot team. The key to COVAS' ability to efficiently and optimally solve scheduling problems is the use of a novel policy-learning approach, apprenticeship scheduling, suited for imitating the method an expert uses to generate the schedule.
This policy-learning technique uses pairwise comparisons between the action taken by a human expert (e.g., schedule agent a to complete task τ_i at time t) and each action not taken (e.g., unscheduled tasks at time t), at each moment in time, to learn the relevant model parameters and scheduling policies demonstrated in training examples provided by the human experts. I evaluate my technique in two real-world domains. First, I apply apprenticeship scheduling to the problem of anti-ship missile defense: protecting a naval vessel from an enemy attack by deploying decoys and countermeasures at the right place and time. I show that apprenticeship scheduling can learn to defend the ship, outperforming human experts on the majority of naval engagements (p < 0.011). Further, COVAS is able to produce globally optimal solutions an order of magnitude faster than traditional, state-of-the-art optimization techniques. Second, I apply apprenticeship scheduling to learn how to function as a resource nurse: the nurse in charge of ensuring the right patient is in the right type of room at the right time and that the right types of nurses are there to care for the patient. After training an apprentice scheduler on demonstrations given by resource nurses, I found that nurses and physicians agreed with the algorithm's advice 90% of the time. Next, I conducted a series of human-subject experiments to understand the human factors consequences of embedding scheduling algorithms in robotic platforms. Through these experiments, I found that an embodied platform (i.e., a physical robot) engenders more appropriate trust and reliance in the system than an un-embodied one (i.e., computer-based system) when the scheduling algorithm works with human domain experts. However, I also found that increasing robot autonomy degrades human situational awareness. Further, there is a complex interplay between workload and workflow preferences that must be balanced to maximize team fluency.
Based on these findings, I develop design guidelines for integrating service robots with autonomous decision-making capabilities into the human workplace.
URL	http://hdl.handle.net/1721.1/112453
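
Illustrative sketch (not part of the record): the pairwise-comparison idea described in the abstract can be pictured roughly as follows. This is a minimal toy version under stated assumptions, not the thesis's implementation. The feature layout (slack to deadline, duration), the logistic-regression learner, the toy demonstrations, and all function names are hypothetical choices made for illustration only.

```python
import math
import random

def pairwise_examples(demonstrations):
    """Turn each expert decision into labeled feature-difference pairs.

    Each demonstration is (chosen_features, [features of actions not taken]).
    """
    examples = []
    for chosen, rejected_list in demonstrations:
        for rejected in rejected_list:
            diff = [c - r for c, r in zip(chosen, rejected)]
            examples.append((diff, 1))                # chosen preferred over rejected
            examples.append(([-d for d in diff], 0))  # mirrored comparison
    return examples

def train_logistic(examples, epochs=200, lr=0.1, seed=0):
    """Fit a linear pairwise-preference model with plain SGD (assumed learner)."""
    rng = random.Random(seed)
    examples = list(examples)
    w = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        rng.shuffle(examples)
        for x, y in examples:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for i, xi in enumerate(x):
                w[i] += lr * (y - p) * xi
    return w

def pick_action(w, candidates):
    """Return the index of the candidate with the highest summed pairwise preference."""
    def pref(a, b):
        z = sum(wi * (ai - bi) for wi, ai, bi in zip(w, a, b))
        return 1.0 / (1.0 + math.exp(-z))
    scores = [
        sum(pref(a, b) for j, b in enumerate(candidates) if j != i)
        for i, a in enumerate(candidates)
    ]
    return max(range(len(candidates)), key=scores.__getitem__)

# Hypothetical toy demonstrations: features are [slack_to_deadline, duration],
# and the "expert" always schedules the task with the least slack.
demos = [
    ([1.0, 3.0], [[4.0, 1.0], [6.0, 2.0]]),
    ([2.0, 1.0], [[5.0, 3.0], [3.0, 2.0]]),
    ([0.5, 2.0], [[2.5, 2.0]]),
]
w = train_logistic(pairwise_examples(demos))
best = pick_action(w, [[5.0, 2.0], [1.0, 2.0], [3.0, 2.0]])
print("learned weights:", w, "chosen candidate index:", best)
```

Comparing the chosen action only against the alternatives available at the same moment means the learner never has to enumerate the full scheduling state space, which is the difficulty the abstract attributes to earlier learning-from-demonstration approaches.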