Please note: This PhD seminar will take place online.
Marvin Pafla, PhD candidate
David R. Cheriton School of Computer Science
Supervisors: Professors Kate Larson, Mark Hancock
With the rise of large language models (LLMs) like GPT, the field of eXplainable artificial intelligence (XAI) has exploded, producing a plethora of methods (e.g., saliency maps) to gain insight into deep neural networks. However, human-participant studies question the efficacy of these methods, particularly when the AI output is wrong.
In this study, we collected and analyzed 156 human-generated text and saliency-based explanations produced in a question-answering task (N=40) and compared them empirically to state-of-the-art XAI explanations (integrated gradients, conservative LRP, and ChatGPT) in a human-participant study (N=136). Our findings show that participants found human saliency maps more helpful than machine saliency maps in explaining AI answers, but task performance correlated negatively with trust in the AI model and its explanations. This finding hints at a confirmation bias, where helpful explanations lead to lower task performance when the AI model is wrong.
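For readers unfamiliar with the machine saliency maps compared in the study, the sketch below shows one common way such token-level attributions are generated: integrated gradients over a question-answering model, here via the Captum library and a Hugging Face model. This is an illustrative example only, not the study's code; the model name, baseline choice, and question/context strings are assumptions.

import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
from captum.attr import LayerIntegratedGradients

# Illustrative model choice; the study's models may differ.
MODEL = "distilbert-base-cased-distilled-squad"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForQuestionAnswering.from_pretrained(MODEL)
model.eval()

question = "Who wrote Hamlet?"
context = "Hamlet is a tragedy written by William Shakespeare."
enc = tokenizer(question, context, return_tensors="pt")

def start_logit(input_ids, attention_mask):
    # Attribute with respect to the model's strongest answer-start logit.
    out = model(input_ids=input_ids, attention_mask=attention_mask)
    return out.start_logits.max(dim=1).values

# Integrate gradients through the embedding layer, from a padded baseline
# to the actual input (a simplified baseline, kept for brevity).
lig = LayerIntegratedGradients(start_logit, model.distilbert.embeddings)
baseline = torch.full_like(enc["input_ids"], tokenizer.pad_token_id)
attrs = lig.attribute(inputs=enc["input_ids"],
                      baselines=baseline,
                      additional_forward_args=(enc["attention_mask"],),
                      n_steps=25)

# Collapse the embedding dimension to one score per token: the "saliency map".
scores = attrs.sum(dim=-1).squeeze(0)
scores = scores / scores.abs().max()
tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
for tok, s in zip(tokens, scores.tolist()):
    print(f"{tok:>15s} {s:+.2f}")

Running the sketch prints a signed relevance score per token, which is the kind of machine-generated saliency map participants compared against human-drawn highlights in the study.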