CS 886: Topics in Natural Language Processing
Spring 2015


Ming Li, DC 3355, x84659, mli@uwaterloo.ca
Course time and location: Wednesdays 2:00-4:50pm, DC 2568
Office hours: Wednesdays 5-6pm, or by appointment
Reference Materials: Daniel Jurafsky and James H. Martin, Speech and Language Processing, Prentice Hall, 2nd Ed., 2008; C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999; and papers listed below.

The course material will come mainly from research papers. I will review NLP in the first week. The students will then be divided into 10 groups, each presenting one of 10 directions of NLP in class (two and a half hours each week). Each group chooses one of the topics listed below. The presenting group is responsible not only for reading the papers I have given, but also for finding and reading more recent related papers and for pointing out new discoveries, research trends and directions in the presentation. Other students are responsible only for reading the given materials and writing a 1/2 page summary before class each week. I will present a last lecture on a new theory of approximating semantics.

This course will focus on modern NLP techniques, especially statistical natural language processing, big data and deep learning, question answering, and new theories.

Marking Scheme: Each student (except the students presenting that week) is expected to hand in a 1/2 page summary of the papers to be presented each week, before that week's class (45 marks, 5 marks per summary). A presentation (educational) and a final project (a critique, an implementation of a system, or original work) are worth 45 marks, and class attendance and participation in discussions are worth 10 marks.

Course announcements and lecture notes will appear on this page. Please look at this page regularly.

Project topics:

  1. Presentation: Michael Doroshenko and Alexey Karyakin. May 13. Language models, N-gram models, perplexity, smoothing methods.
  2. Presentation: Besat Kassaie and Wanqi Li. June 3. LDA, Text classification, question classification, authorship attribution, author gender classification, academic paper classification
  3. Sentiment analysis. Presentation by Dimitrios Skreptos, Luis Blanco, and Sri Bolisetti (Luis might cover summarization), and guest speaker Professor Fei Song. June 17 and 24.
  4. Big data and deep learning for NLP. Presentation by Dylan Drover, Borui Ye, and Jie Peng. Guest lecture by Professor Pascal Poupart. July 8, July 15
  5. Summarization, passage retrieval.
  6. Presentation: Hicham El-Zein and Jian Li. May 20. Statistical Translation.
  7. NER -- Named entity recognition, part-of-speech tagging. Presentation by Omar Choudry and Yuguang Zhu, July 22
  8. (Ranked) Information retrieval, search engines. TF-IDF weighting, cosine distance.
  9. Semantics, thesaurus methods, information theory, and information distance. I will cover this topic on July 29.
  10. Presentation: Bahareh Sarrafzdeh and Alexandra Vtyurina, June 10. Question answering.
  11. Presentation by Taras Mychaskiw and Aaron Voelker, May 27. Statistical parsing, the Penn Treebank.



Maintained by Ming Li