Introduction

by Jimmy Lin (June 11, 2005)

Aranea is a Web-based factoid question answering system that uses a combination of data redundancy and database techniques. Its performance in TREC 2002, TREC 2003, and TREC 2004 was competitive. The predecessor to Aranea is the askMSR system that colleagues at Microsoft Research and I developed in 2001. The following paper provides an overview of the Aranea system:

Jimmy Lin and Boris Katz. Question Answering from the Web Using Knowledge Annotation and Knowledge Mining Techniques. Proceedings of the Twelfth International Conference on Information and Knowledge Management (CIKM 2003), 2003.

As I am no longer actively working on Web-based factoid question answering, I am releasing Aranea under an open source license. I would be interested in hearing feedback about the system. If you publish results based on Aranea, please cite the above paper. Also, please understand that this package is released as is; I do not intend to support it. I will try my best to help people out, but please do not get upset if you email me with a technical support question and get no response.

Additional relevant papers are listed below:

Eric Brill, Jimmy Lin, Michele Banko, Susan Dumais, and Andrew Ng. Data-Intensive Question Answering. Proceedings of the Tenth Text REtrieval Conference (TREC 2001), November 2001, Gaithersburg, Maryland.

Jimmy Lin, Aaron Fernandes, Boris Katz, Gregory Marton, and Stefanie Tellex. Extracting Answers from the Web Using Knowledge Annotation and Knowledge Mining Techniques. Proceedings of the Eleventh Text REtrieval Conference (TREC 2002), November 2002, Gaithersburg, Maryland.

Boris Katz, Jimmy Lin, Daniel Loreto, Wesley Hildebrandt, Matthew Bilotti, Sue Felshin, Aaron Fernandes, Gregory Marton, and Federico Mora. Integrating Web-based and Corpus-based Techniques for Question Answering. Proceedings of the Twelfth Text REtrieval Conference (TREC 2003), November 2003, Gaithersburg, Maryland.

Boris Katz, Matthew Bilotti, Sue Felshin, Aaron Fernandes, Wesley Hildebrandt, Roni Katzir, Jimmy Lin, Daniel Loreto, Gregory Marton, Federico Mora, and Ozlem Uzuner. Answering Multiple Questions On a Topic From Heterogeneous Resources. Proceedings of the Thirteenth Text REtrieval Conference (TREC 2004), November 2004, Gaithersburg, Maryland.

Distribution Information

The Aranea question answering system
Copyright (C) 2002-2005, Jimmy Lin

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.

Please refer to license.txt for more details.