CS 886: Spring 2015: Home

CS 886: Topics in Natural Language Processing
Spring 2015

INSTRUCTOR:

Ming Li, DC 3355, x84659, mli@uwaterloo.ca

Course time and location: Wednesdays 2:00-4:50pm, DC 2568

Office hours: Wednesdays 5-6pm, or by appointment

Reference Materials: Daniel Jurafsky and James H. Martin, Speech and Language Processing, Prentice Hall, 2nd Ed., 2008; C. Manning and H. Schutze, Foundations of Statistical Natural Language Processing, MIT Press, 1999; and the papers listed below.

The course material will come mainly from research papers. I will give a review of NLP in the first week. The students will then be divided into 10 groups, each presenting one of 10 directions of NLP in class (two and a half hours each week). Each group chooses one of the topics listed below. The presenting group is responsible not only for reading the papers I have assigned, but also for finding and reading more recent related papers and for pointing out new discoveries, research trends, and directions in its presentation. Other students are responsible only for reading the assigned materials and writing a half-page summary before class each week. I will give the last lecture, on a new theory of approximating semantics.

This course will focus on modern NLP techniques, especially statistical natural language processing, big data and deep learning, question answering, and new theories.

Marking Scheme: Each student (except the students presenting that week) is expected to hand in a half-page summary of the papers to be presented each week, before that week's class (45 marks, 5 marks per summary). A presentation (educational) and a final project (a critique, an implementation of a system, or original work) together count for 45 marks. The remaining 10 marks are for class attendance and participation in discussions.

Course announcements and lecture notes will appear on this page. Please look at this page regularly.

Project topics:

Presentation: Michael Doroshenko and Alexey Karyakin. May 13. Language models, N-gram models, perplexity, smoothing methods.
- PF Brown, PV Desouza, RL Mercer, VJD Pietra: Class-based n-gram models of natural language. Computational Linguistics, 18:4(1992) 467-479.
- R. Kneser and H. Ney, Improved backing-off for m-gram language modeling ICASSP-95.
- KW Church and WA Gale, A comparison of the enhanced Good-Turing and deleted methods for estimating probabilities of English bigrams. Computer Speech and Language, 5:1(1991) 19-54.
Presentation: Besat Kassaie and Wanqi Li. June 3. LDA, Text classification, question classification, authorship attribution, author gender classification, academic paper classification
- DM Blei, AY Ng, MI Jordan, Latent Dirichlet allocation. The Journal of Machine Learning Research, 2003.
Sentiment analysis. Presentation by Dimitrios Skrepetos, Luis Blanco, Sri Bolisetti (Luis might cover summarization), and guest speaker Professor Fei Song. June 17 and June 24
- X. Glorot, A. Bordes, Y. Bengio Domain adaptation for large-scale sentiment classification: A deep learning approach. (download from google scholar)
- B. O'Connor, R. Balasubramanyan, B.R. Routledge, N.A. Smith: From tweets to polls: linking text sentiment to public opinion time series, In ICWSM-2010.
- J. Bollen, H. Mao, X. Zeng, Twitter mood predicts the stock market. J. Computational Science, 2:1(2011), 1-8.
Big data and deep learning for NLP. Presentation by Dylan Drover, Borui Ye, and Jie Peng. Guest lecture by Professor Pascal Poupart. July 8, July 15
- R. Collobert, J. Weston, A unified architecture for natural language processing: deep neural networks with multitask learning. ICML'08, pp. 160-167.
- K. Vodraha: Deep learning for NLP, 2015 (google download)
- R. Socher, Y. Bengio, C. Manning, Deep learning for NLP, ACL 2012 (Google download.)
Summarization, passage retrieval.
- HP Luhn: The automatic creation of literature abstracts. IBM J. of Research and Development 2:2(1958), 159-165.
- CY Lin and E. Hovy: Automatic evaluation of summaries using n-gram co-occurrence statistics. NAACL'03 pp 71-78, 2003.
- Vanderwende et al: The PYTHY summarization system: Microsoft Research at DUC 2007. 2007. (doc download from google scholar).
Presentation: Hicham El-Zein and Jian Li. May 20. Statistical Translation.
- PF Brown, J Cocke, SAD Pietra, VJD Pietra: A statistical approach to machine translation. Computational Linguistics, 16:79--85, 1990.
- PF Brown, VJD Pietra, SAD Pietra and RL Mercer: The mathematics of statistical machine translation: Parameter estimation. J. Computational Linguistics Vol 19, Issue 2, 1993, p. 263-311.
- Franz J. Och: Statistical machine translation: foundations and recent advances. Dec. 19, 2010. (Download from Google.) (Och led the creation of Google Translate, which was serving 200 million people daily as of Oct 2014.)
- Industry: Google translate, Bing translate.
NER -- Named entity recognition, part-of-speech tagging. Presentation by Omar Choudry and Yuguang Zhu, July 22
- A. Ratnaparkhi, A maximum entropy model for part-of-speech tagging. Proc. of Conf on empirical methods in NLP, 1996. pp 133-142.
- K. Toutanova, D. Klein, C.D. Manning, Y. Singer. Feature-rich part-of-speech tagging with a cyclic dependency network. NAACL'03, pp 173-180. 2003.
- T. Brants. TnT: a statistical part-of-speech tagger. Proc. of 6th Conf on Applied NLP. 2000. pp. 224-231.
- Optional: CRFs (J. Lafferty, A. McCallum, FCN Pereira, Conditional random fields: probabilistic models for segmenting and labeling sequence data. 2001. To replace the 3rd paper.)
(Ranked) Information retrieval, search engines. TF-IDF weighting, cosine distance.
- G. Salton, A. Wong, CS. Yang, A vector space model for automatic indexing. CACM, 1975. (SMART system, vector space model is presented here.)
Semantics, Thesaurus methods, information theory, and information distance. I will cover this topic, July 29.
- D. Lin: An information-theoretic definition of similarity. ICML, 1998.
- P. Resnik: Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language. J. Artif. Intell. Res. (JAIR), 1999.
- JJ Jiang and DW Conrath: Semantic similarity based on corpus statistics and lexical taxonomy. arXiv:cmp-lg/9709008v1
Presentation: Bahareh Sarrafzadeh and Alexandra Vtyurina, June 10. Question answering.
- William Tunstall-Pedoe: True Knowledge: open-domain question answering using structured knowledge and inference. AI Magazine, 2010, p. 80-92.
- D. Ravichandran and E. Hovy: Learning surface text patterns for a question answering system. Proceedings of the 40th ACL, pp 41-47.
- Xin Li and Dan Roth, Learning question classifiers. COLING'02
- Systems: Siri, Wolfram Alpha, Evi, Google, Ask.com, Bing, NELL.
Presentation by Taras Mychaskiw and Aaron Voelker, May 27. Statistical parsing, the Penn Treebank.
- M.P. Marcus, M.A. Marcinkiewicz, B. Santorini: Building a large annotated corpus of English: The Penn Treebank, Computational Linguistics, 1993.
- M. Collins, Head-driven statistical models for natural language parsing. Computational Linguistics, 2003.
- E. Charniak, A maximum-entropy-inspired parser. NAACL 2000 Proceedings of the 1st North American chapter of the Assoc for Computational Linguistics Conf. pp 132-139.

Lectures:

Lecture 1. May 6: History, Overview. Basic notions and definitions. Good ideas in natural language processing.
Lecture 2.1. Language models, first part. Lecture 2.2. Language models, second part. Presentation by Michael Doroshenko and Alexey Karyakin. May 13. Language models, N-gram models, perplexity, smoothing methods.
Lecture 3. Statistical translation. Presentation by Hicham El-Zein and Jian Li. May 20. Statistical Translation.
Lecture 4. Statistical parsing. Presentation by Taras Mychaskiw and Aaron Voelker, May 27. Statistical parsing, the Penn Treebank.
Lecture 5.1. LDA, Text classification. Lecture 5.2. Topic Models. Presentation by Besat Kassaie and Wanqi Li. June 3. LDA, Text classification, question classification, authorship attribution, author gender classification, academic paper classification
Lecture 6. Question Answering, TrueKnowledge, Complex QA. Presentation by Bahareh Sarrafzadeh and Alexandra Vtyurina, June 10. Question answering.
Lecture 7.1. Sentiment Analysis Basics. Lecture 7.2. Sentiment Analysis. Presentation by Dimitrios Skrepetos and Sri Bolisetti. June 17.
Lecture 8. Sentiment Analysis, summarization, topic modeling. Presented by Luis Blanco and guest speaker Professor Fei Song from the University of Guelph (on topic modeling applied to sentiment analysis). June 24.
Lecture 9. Deep learning. Presentation by Dylan John Drover, Borui Ye, and Jie Peng. July 8.
Lecture 10. Deep learning: RNN, sentiment analysis, presented by Borui Ye; and deep learning: multitask NLP CNN, presented by Jie Peng. July 15. Ming Li presents a CNN that plays Go (by Clark and Storkey).
Lecture 11. Guest lecture by Professor Pascal Poupart (on self-adaptive hierarchical sentence modeling and sum-product networks). Guest lecture by Han Zhao. Presentation on POS tagging by Yuguang. Named entity recognition, presented by Omar Choudry and Yuguang Zhang. July 22.
Lecture 12. Presentation by Professor Chrysanne Di Marco: "The frequency of hedging cues in citation contexts in scientific writing". Presentation by Ming Li: A theory of approximating semantics. July 29.

Announcements:

Maintained by Ming Li