Getting Aranea up and running...

  1. Type make at the command line to build Aranea dependencies.
  2. Uncompress etc/test_questions_cache.tar.gz into /tmp. Afterward, /tmp/web_cache should contain many files whose names look like URLs. Please read this important note on Web caching!
  3. Issue the following command:

    
    ./answerit.pl etc/basic.teoma_and_google.nolookup.modules etc/test.questions
    

    You should get something like the following on stdout:

    Starting Aranea...
    modules list = 
    question file = 
         
    Question 221: Who killed Martin Luther King?
    221 Q0 Doc 1 4769.90783579419 aranea James Earl Ray
    ...
         
    Question 230: When did the vesuvius last erupt?
    230 Q0 Doc 1 1572.80674011279 aranea 1944 AD
    ...
         
    Question 263: When was Babe Ruth born?
    263 Q0 Doc 1 8196.23274081849 aranea February 6, 1895
    ...
    

  4. Congratulations! Aranea seems to be up and running.
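
The answer lines in the sample output above can be post-processed with standard tools. The sketch below pulls out the top-ranked answer for each question. It assumes, based only on the sample lines, that the first field is the question number, the fourth field is the rank, and everything after the sixth field is the answer text; this layout is inferred from the example, not taken from an Aranea specification. The variable names are illustrative.

```shell
# Sample lines copied from the output shown above.
sample_output='Question 221: Who killed Martin Luther King?
221 Q0 Doc 1 4769.90783579419 aranea James Earl Ray
Question 263: When was Babe Ruth born?
263 Q0 Doc 1 8196.23274081849 aranea February 6, 1895'

# Keep lines whose 2nd field is "Q0" and whose 4th field (assumed rank) is 1,
# then print the question number, a tab, and the answer text (fields 7..NF).
top_answers=$(printf '%s\n' "$sample_output" | awk '$2 == "Q0" && $4 == 1 {
  answer = $7
  for (i = 8; i <= NF; i++) answer = answer " " $i
  print $1 "\t" answer
}')

printf '%s\n' "$top_answers"
```

To use this on a real run, pipe the stdout of answerit.pl into the same awk command instead of the sample text.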

NOTE: If you get a UTF-8 error, try changing the LANG environment variable from en_US.UTF-8 to en_US.
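
A minimal sketch of that workaround in a POSIX-style shell (sh, bash); other shells such as csh use a different syntax. The export affects only the current shell session and the programs started from it.

```shell
# Show the current locale setting before changing anything.
echo "LANG was: ${LANG:-unset}"

# Override LANG for the rest of this shell session.
export LANG=en_US

# Alternatively, override it for a single invocation only:
# LANG=en_US ./answerit.pl etc/basic.teoma_and_google.nolookup.modules etc/test.questions
```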

AnswerIt!

The script answerit.pl is used to answer factoid questions:


usage: ./answerit.pl [modules-file] [questions-file]

The modules-file specifies which Aranea modules to invoke, and in what order; see etc/*.modules for examples. The names are self-explanatory: teoma uses Teoma, google uses Google, and lookup indicates that database lookup is employed for certain classes of questions (e.g., "What is the population of X?"). Currently, the best-performing combination of modules is the sequence in etc/basic.teoma-and-google.lookup.modules.

The questions-file contains the factoid questions to be answered, one per line. Each line contains the question number qno followed by a tab and then the natural language question. See etc/test.questions for a sample questions file.
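
As an illustration of that format, the sketch below writes a small questions file and checks that every line has exactly two tab-separated fields. The output path and variable names are hypothetical, and the check assumes question numbers are plain integers, as in the sample questions shown earlier.

```shell
# Hypothetical output path; any writable location works.
questions_file="${TMPDIR:-/tmp}/my.questions"

# printf reuses the format for each pair of arguments, writing
# "<qno><TAB><question>" on its own line.
printf '%s\t%s\n' \
  221 'Who killed Martin Luther King?' \
  263 'When was Babe Ruth born?' > "$questions_file"

# Sanity check: count lines that are NOT "integer<TAB>text".
bad_lines=$(awk -F '\t' 'NF != 2 || $1 !~ /^[0-9]+$/ { n++ } END { print n + 0 }' "$questions_file")
echo "malformed lines: $bad_lines"
```

The resulting file can then be passed as the second argument to answerit.pl.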