Seminar • Data Systems — Meta-Analysis for Retrieval Experiments Involving Multiple Test Collections

Tuesday, April 9, 2019 10:30 am - 10:30 am EDT (GMT -04:00)

Ian Soboroff, Leader, Retrieval Group
National Institute of Standards and Technology

Traditional practice recommends running information retrieval experiments over multiple test collections to support, if not prove, the claim that performance gains are likely to generalize to other collections or tasks. However, because of the pooling assumptions, evaluation scores are not directly comparable across different test collections.

We present a widely used statistical tool, meta-analysis, as a framework for reporting results from IR experiments using multiple test collections. We demonstrate the meta-analytical approach through two standard experiments on stemming and pseudo-relevance feedback, and compare the results to those obtained from score standardization. Meta-analysis incorporates several recent recommendations in the literature, including score standardization, reporting effect sizes rather than score differences, and avoiding a reliance on null-hypothesis statistical testing, in a unified approach. It therefore represents an important methodological improvement over using these techniques in isolation.


Bio: Dr. Ian Soboroff is a computer scientist and leader of the Retrieval Group at the National Institute of Standards and Technology (NIST). The Retrieval Group organizes the Text REtrieval Conference (TREC), the Text Analysis Conference (TAC), and the TREC Video Retrieval Evaluation (TRECVID). These are all large, community-based research workshops that drive the state of the art in information retrieval, video search, web search, information extraction, text summarization, and other areas of information access.

He has co-authored many publications in information retrieval evaluation, test collection building, text filtering, collaborative filtering, and intelligent software agents. His current research interests include building test collections for social media environments and nontraditional retrieval tasks.