CS784 | Computational Linguistics Spring 2013

Organizational Meeting:
Monday May 6 1:30-3:30
DC2306C (AI Lab Conference Room)
Instructor: Chrysanne Di Marco, email: cdimarco AT uwaterloo.ca


All sessions will be held in the AI Lab conference room (DC2306C).
Copies of texts will be available for short-term loan at the DC Library Circulation Desk.


Course Overview

Natural language systems have evolved tremendously in the past few years, from dealing only with small handcrafted examples to supporting extremely large, real-world applications. This course is intended for anyone interested in gaining an in-depth understanding of current theory and algorithms in Computational Linguistics and of real-world applications in Natural Language computing.

Participants will cover a wide variety of topics in a fair amount of depth. There are no formal prerequisites other than an interest in the topic and the ability to read and analyze technical material. Programming experience will be helpful for the course project but is not essential: both programmers and non-programmers will be needed on the project teams. Some knowledge of linguistics or a second language would be helpful but is not necessary.

Grading will be based on weekly one-page position papers on the readings (10%); participation in discussions (10%); a short presentation (5%); a term paper on a topic of your choice (40%); and participation in a team project (35%).

Position papers will address questions related to linguistic theory, computational linguistics methods and algorithms, and applications.

Both non-programmers and non-linguists are encouraged to take the course. Project teams will be made up of team members who have complementary skills: those with either linguistic background or programming skills will be assigned to tasks suited to their skills, and graded accordingly.

Auditors are welcome but will be expected to write at least one position paper and to participate in discussions.

If you are interested in this course and would like to do some background preparation on your own, Chapter 1 in the following textbook is recommended:

Daniel Jurafsky and James H. Martin
Speech and language processing: An introduction to natural language processing,
computational linguistics and speech recognition (second edition)

Prentice Hall, 2008

DC Library Short-term loan call number: UWD 1440

 


Course Outline


SESSIONS 1 to 4 Information Design

Session 1: Course Organizational Meeting/Who is "Watson"?

Monday May 6 1:30-3:30 DC2306C

Readings:

Jurafsky and Martin, Chapter 1

Reading available from instructor (send email)



Session 2: Syntax and Parsing I

Monday May 13 1:30-3:30 DC2306C

Readings:

Jurafsky and Martin, 2 (Regular Expressions and Automata), 4 (N-grams), 5 (Part-of-Speech Tagging)

DC Library Short-term loan call number: UWD 1440



NO Lecture Monday May 20 Victoria Day


Session 3: Syntax and Parsing II

Monday May 27 1:30-3:30 DC2306C

Readings:

Jurafsky and Martin, 12 (Context Free Grammars)

DC Library Short-term loan call number: UWD 1440



Session 4: Syntax and Parsing III

Monday June 3 1:30-3:30 DC2306C

Readings:

Jurafsky and Martin, 13 (Syntactic Parsing), 14 (Statistical Parsing)

DC Library Short-term loan call number: UWD 1440



SESSIONS 5 to 8 Information Mining

Session 5: Text Summarization

Monday June 10 1:30-3:00 DC2306C

Team Project out: Automated Text Summarizer

Readings:

Jurafsky and Martin, 23 (Question Answering and Summarization)

Eduard Hovy,
"Text summarization",
The Oxford handbook of computational linguistics,
Oxford University Press, 2005.

How to Build a Text Summarizer Tutorial

Text Summarizer Starter Code (zip file)


Session 6: Bonus Project: Narratology Machine

Monday June 17


Sessions 7a and 7b: Semantics

Monday June 17 and Monday June 24

Readings:

Jurafsky and Martin, 15 (Features and Unification), 17 (The Representation of Meaning), 18 (Computational Semantics), 19 (Lexical Semantics), 20 (Computational Lexical Semantics)

DC Library Short-term loan call number: UWD 1440



NO Lecture Monday July 1 Canada Day


Session 8: Discourse

Monday July 8

Readings:

Jurafsky and Martin, 21 (Computational Discourse)

DC Library Short-term loan call number: UWD 1440



SESSIONS 9 to 10 Information Manufacturing

Session 9: Information Extraction

Monday July 15

Readings:

Jurafsky and Martin, 22 (Information Extraction)

DC Library Short-term loan call number: UWD 1440



Session 10: TBA

Monday July 22 1:30-3:30 DC2306C

Guest Speaker and Readings:

TBA


FINAL SESSION Student Presentations

Session 11: Presentations

Monday July 29