Natural language systems have evolved tremendously in the past few years, from handling only small handcrafted examples to supporting extremely large, real-world applications. This course is intended for anyone interested in gaining an in-depth understanding of current theory and algorithms in computational linguistics, and of real-world applications in natural language computing.
Participants will cover a wide variety of topics in a fair amount of depth. There are no formal requirements other than interest in the topic and ability to read and analyze technical material. Programming experience will be helpful for the course project, but is not essential. Programmers and non-programmers will both be needed as members of the teams for the course projects. Some knowledge of linguistics or a second language would be helpful but is not necessary.
Grading will be based on weekly one-page position papers on the readings (10%); participation in discussions (10%); a short presentation (5%); a term paper on a topic of your choice (40%); and participation in a team project (35%).
Position papers will address questions related to linguistic theory, computational linguistics methods and algorithms, and applications.
Both non-programmers and non-linguists are encouraged to take the course. Project teams will be made up of team members who have complementary skills: those with either linguistic background or programming skills will be assigned to tasks suited to their skills, and graded accordingly.
Auditors are welcome but will be expected to write at least one position paper and to participate in discussions.
If you are interested in this course and would like to do some background preparation on your own, Chapter 1 in the following textbook is recommended:
Daniel Jurafsky and James H. Martin
Speech and language processing: An introduction to natural language processing, computational linguistics and speech recognition (second edition)
Prentice Hall, 2008
Monday May 6 1:30-3:30 DC2306C
Jurafsky and Martin, Chapter 1
Monday May 13 1:30-3:30 DC2306C
Jurafsky and Martin, 2 (Regular Expressions and Automata), 4 (N-grams), 5 (Part-of-Speech Tagging)
Monday May 27 1:30-3:30 DC2306C
Jurafsky and Martin, 12 (Context Free Grammars)
Monday June 3 1:30-3:30 DC2306C
Jurafsky and Martin, 13 (Syntactic Parsing), 14 (Statistical Parsing)
Monday June 10 1:30-3:00 DC2306C
Team Project out: Automated Text Summarizer
Jurafsky and Martin, 23 (Question Answering and Summarization)
Eduard Hovy,
"Text summarization",
The Oxford handbook of computational linguistics,
Oxford University Press, 2005.
How to Build a Text Summarizer Tutorial
Text Summarizer Starter Code (zip file)
Monday June 17 and Monday June 24
Jurafsky and Martin, 15 (Features and Unification), 17 (The Representation of Meaning), 18 (Computational Semantics), 19 (Lexical Semantics), 20 (Computational Lexical Semantics)
Monday July 8
Jurafsky and Martin, 21 (Computational Discourse)
Monday July 15
Jurafsky and Martin, 22 (Information Extraction)
Monday July 22 1:30-3:30 DC2306C
TBA
Monday July 29