Resources
This page is no longer being maintained. If you're looking for a specific resource, my publications or my GitHub profile would be good starting points.
Raw nugget pyramids data
Released: April 13, 2006 (Last update: September 9, 2006)
Jimmy Lin and Dina Demner-Fushman. Will Pyramids Built of Nuggets Topple Over? Proceedings of the 2006 Human Language Technology Conference and the North American Chapter of the Association for Computational Linguistics Annual Meeting (HLT/NAACL 2006), pages 383-390, June 2006, New York City, New York.
- Download: nugget-pyramids.tar.gz (646k)
- Download: combine_judgments.pl (Perl script for building nugget pyramids)
Pourpre scoring script for automatically evaluating complex questions
Released: May 29, 2005 (Last update: June 13, 2007)
Jimmy Lin and Dina Demner-Fushman. Methods for Automatically Evaluating Answers to Complex Questions. Information Retrieval, 9(5):565-587, 2006. [DOI:10.1007/s10791-006-9003-7]
Jimmy Lin and Dina Demner-Fushman. Automatically Evaluating Answers to Definition Questions. Proceedings of the 2005 Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP 2005), pages 931-938, October 2005, Vancouver, Canada.
- Download: pourpre-1.1.tar.gz (release 06/13/2007, 404k) [README]
- Older version: pourpre-1.0.tar.gz (release 05/29/2005, 376k) [README]
The Aranea question answering system
Released: June 11, 2005
Aranea is a Web-based factoid question answering system that uses a combination of data redundancy and database techniques. Its performance in TREC 2002, TREC 2003, and TREC 2004 was competitive. The predecessor to Aranea is the askMSR system that colleagues at Microsoft Research and I developed in 2001.
Jimmy Lin. An Exploration of the Principles Underlying Redundancy-Based Factoid Question Answering. ACM Transactions on Information Systems, 27(2):1-55, 2007.
- Download: Aranea-r1.00.tar.gz (52221k)
QA test collection
Released: June 9, 2005
The question answering test collection as described in: Jimmy Lin and Boris Katz. Building a Reusable Test Collection for Question Answering. Journal of the American Society for Information Science and Technology, 57(7):851-861, 2006.
- Download: qa-test-collection.tar.gz (32k)
Java version of Brill's Part-of-Speech Tagger
Released: December 27, 2004
Eric Brill's part-of-speech tagger, ported to Java via the Java Native Interface (JNI). Strictly speaking, the port is based on Benjamin Han's ePost package, which is a cleaned-up version of Brill's original code. It has been tested on both Linux and Windows (under Cygwin).
- Documentation: javadoc
- Download: brill-java-1.0.tar.gz (9352 KB)
LPost: Perl version of Brill's Part-of-Speech Tagger
Released: December 27, 2004
Eric Brill's part-of-speech tagger as a Perl module. Like the Java version, it is based on Benjamin Han's ePost package. It has been tested on both Linux and Windows (under Cygwin with ActiveState Perl).
- Documentation: LPost POD
- Download: LPost-1.0.tar.gz (593 KB)