============================================
Pourpre scoring script for complex questions
============================================

by Dina Demner-Fushman and Jimmy Lin

Version 1.1, released 06/13/2007
Version 1.0, released 05/29/2005

For a description of what Pourpre actually does, please refer to:

  Jimmy Lin and Dina Demner-Fushman. Methods for Automatically
  Evaluating Answers to Complex Questions. Information Retrieval,
  9(5):565-587, 2006. [DOI: 10.1007/s10791-006-9003-7]

  Jimmy Lin and Dina Demner-Fushman. Automatically Evaluating Answers
  to Definition Questions. Proceedings of the 2005 Human Language
  Technology Conference and Conference on Empirical Methods in Natural
  Language Processing (HLT/EMNLP 2005), pages 931-938, October 2005,
  Vancouver, Canada.

If you find this package helpful, please feel free to share your
experiences with us.

-------------------------------------------------------------------------------
Scripts included in this package:

  pourpre-1.1c.pl: generates scores based on simple term counts
  pourpre-1.1i.pl: generates scores based on term idf values
    use the -h option to display full usage info

  compute-kendall-tau.pl: computes Kendall's tau

-------------------------------------------------------------------------------
Sample invocation:

$ ./pourpre-1.1c.pl -s samplerun.trec2003.bbn2003c -r nuggets.trec2003
BBN2003C        0.40932583708841

$ ./pourpre-1.1c.pl -s samplerun.trec2004.run12 -r nuggets.trec2004
RUN-12  0.437531652867849

-------------------------------------------------------------------------------
Current Performance:

Kendall's tau correlation on standard testsets (using default settings)

             pourpre-c   pourpre-i
TREC 2003    0.889       0.876
TREC 2004    0.828       0.812
TREC 2005    0.725       0.692

NOTE: Correlation is against human-assessed F(Beta=3) scores in all cases,
even though TREC 2003 used Beta=5 as the official score.

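The correlations above compare the ranking of runs under Pourpre scores
with the ranking under human-assessed scores. As a rough illustration of
what such a figure measures, here is a minimal Python sketch of Kendall's
tau (the tau-a variant, with no correction for ties). It is not the code in
compute-kendall-tau.pl; the function name kendall_tau and the four scores
per list are made up for this example.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a between two equal-length score lists (no tie correction)."""
    if len(xs) != len(ys):
        raise ValueError("score lists must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two items to compare")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of the product says whether both lists order this pair
        # of runs the same way (positive) or in opposite ways (negative).
        s = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Illustrative scores for four hypothetical runs (not real data).
auto_scores = [0.41, 0.35, 0.28, 0.19]
human_scores = [0.39, 0.25, 0.30, 0.15]
print(kendall_tau(auto_scores, human_scores))
```

A value of 1 means the two score lists rank every pair of runs in the same
order; a value of -1 means they disagree on every pair.
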
-------------------------------------------------------------------------------
Distribution Information

Pourpre scoring script for complex questions
Copyright (C) 2005-2007, Dina Demner-Fushman and Jimmy Lin

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, please visit

  http://www.gnu.org/copyleft/gpl.html

See licesnse.txt for full details

-------------------------------------------------------------------------------
Version 1.1, released 07/13/2007
- bug fix: in the pourpre-c script, the count mode was inadvertently set
  to binary, a setting intended for development purposes only. Binary
  count mode leads to lower performance.
- bug fix: in the pourpre-c script, recall = r/R; previously R was the
  count of all nuggets (it should be the count of vital nuggets only).
  Thanks to Adrian Novischi of LCC for catching this!

Version 1.0, released 05/29/2005
- Initial release
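
The recall fix listed under Version 1.1 concerns the denominator R in
recall = r/R. To show where that denominator enters the final score, here
is a minimal Python sketch of the TREC-style nugget F-score, assuming the
usual definition (a length allowance of 100 non-whitespace characters per
matched nugget, and Beta weighting recall over precision). It takes the
match counts as given and does not reproduce how the Pourpre scripts match
nuggets; the function name nugget_f_score and its parameters are
hypothetical.

```python
def nugget_f_score(vital_matched, okay_matched, vital_total, answer_length, beta=3.0):
    """TREC-style nugget F-score for a single question.

    answer_length is the number of non-whitespace characters in the answer.
    """
    if vital_total <= 0:
        raise ValueError("need at least one vital nugget")
    # Recall is r / R, where R counts vital nuggets only.
    recall = vital_matched / vital_total
    # Every matched nugget (vital or okay) earns 100 characters of allowance.
    allowance = 100 * (vital_matched + okay_matched)
    if answer_length <= allowance:
        precision = 1.0
    else:
        precision = 1.0 - (answer_length - allowance) / answer_length
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0
    return (b2 + 1) * precision * recall / denom


# Illustrative numbers only: 2 of 4 vital nuggets and 1 okay nugget matched
# in a 450-character answer.
print(nugget_f_score(vital_matched=2, okay_matched=1, vital_total=4, answer_length=450))
```

Counting all nuggets in R, as the script did before the fix, would make the
denominator larger and so lower recall for any question that has okay
nuggets.
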