PhD Defence • Information Retrieval • Evaluation of Information Access Systems in the Generative Era

Wednesday, May 28, 2025 2:00 pm - 5:00 pm EDT (GMT -04:00)

Please note: This PhD defence will take place online.

Negar Arabzadehghahyazi, PhD candidate
David R. Cheriton School of Computer Science

Supervisor: Professor Charles Clarke

The rapid development of neural retrieval models and generative information-seeking systems has outpaced traditional evaluation methods, revealing critical gaps, especially when relevance labels are sparse. Current frameworks often fail to compare retrieval-based and generation-based systems fairly. Large language models (LLMs) further challenge conventional evaluation while offering new possibilities for automation.

This thesis first shows that sparse labeling introduces bias, underestimating strong models that retrieve relevant but unjudged documents. To address this, we propose a new evaluation method based on the Fréchet distance, which improves robustness and enables comparison between retrieval and generative systems. We then explore the use of LLMs for evaluation, focusing on automated relevance judgments. We compare LLM-based methods, expose the lack of standardization among them, and propose a framework for assessing these approaches by their alignment with human labels and their impact on system rankings. We also show how prompt variations affect the consistency of LLM-based evaluation. Finally, we extend our analysis to evaluating generated content across tasks, including retrieval-assisted methods for text generation, IR-inspired evaluation for text-to-image models, and a broader framework for assessing LLM-powered applications. Together, these contributions advance evaluation methods for modern information access systems.


Attend this PhD defence virtually on Zoom.