"The Perils of Trial-and-Error Reward Design: Misdesign through Overfitting and Invalid Task Specifications", Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI), Washington, D.C., 02/2023.
"Do Feature Attribution Methods Correctly Attribute Features?", Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI), 02/2022.
"The Irrationality of Neural Rationale Models", Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2nd Workshop on Trustworthy Natural Language Processing (TrustNLP), 07/2022.
"Revisiting Human-Robot Teaching and Learning Through the Lens of Human Concept Learning Theory", ACM/IEEE International Conference on Human-Robot Interaction (HRI), 03/2022.
"How to Understand Your Robot: A Design Space Informed by Human Concept Learning", International Conference on Robotics and Automation (ICRA), Workshop on Social Intelligence in Humans and Robots (SIHR), 05/2021.
"RoCUS: Robot Controller Understanding via Sampling", Conference on Robot Learning (CoRL), London, UK, Proceedings of Machine Learning Research, 11/2021.
"Modeling Blackbox Agent Behaviour via Knowledge Compilation", AAAI Conference on Artificial Intelligence, Workshop on Plan, Activity, and Intent Recognition (PAIR), 2020.
"Evaluating the Interpretability of the Knowledge Compilation Map: Communicating Logical Statements Effectively", International Joint Conference on Artificial Intelligence (IJCAI), Macau, China, 08/2019.