CS 842: Functional Parsing

Logistics

There is no textbook for CS 842. We will be reading a number of papers from the list below.

The introductory lectures are arranged topically into modules of various lengths. Presentation slides for the modules will be available on the Handouts page.

There will be a small number of programming assignments in the first part of the course, using Racket and Haskell. Details will be posted on the Assignments page.

The software used in the course is Racket and Haskell, both available for multiple platforms.

Students will do one or more presentations (depending on class enrollment) of one or more of the listed papers or other relevant papers, and a final project, which can be an implementation, an experience report in an area of application, or a survey paper. A proposal for the final project of one to three typeset pages is due on October 2. Presentations will start in October sometime. The final project is due on December 2.

There will be no class meeting on September 25 because your instructor will be away at a conference.

Papers

In process of being added.

This list is not exhaustive, and we may not read every paper on it. We will also pay different amounts of attention to different papers. The list is roughly divided up by topic; each topic typically has a number of follow-up papers not listed, which may be discussed during class meeting time or may be appropriate for projects. No chronological ordering is implied.

Derivatives

Scott Owens, John Reppy, and Aaron Turon. Regular-expression derivatives re-examined. Journal of Functional Programming 19, 02 (2009).

Nils Anders Daniellson. Total parser combinators. Proc. 15th International Conference on Functional Programming (2010).

Matthew Might, David Darais, and Daniel Spiewak. Parsing with derivatives. Proc. 16th International Conference on Functional Programming (2011).

Packrat parsing

Bryan Ford. Packrat parsing: simple, powerful, lazy, linear time. Proc. 7th International Conference on Functional Programming (2002).

Bryan Ford. Parsing Expression Grammars: A recognition-based syntactic foundation. Proc. 31st ACM Symposium on Principles of Programming Languages (2004).

Recursive descent

Peter Norvig. Techniques for automatic memoization with applications to context-free parsing. Computational Linguistics, 17, 1 (1991).

Mark Johnson. Memoization of top-down parsing. Computational Linguistics, 21, 3 (1995).

Tom Ridge. Simple, functional, sound and complete parsing for all context-free grammars. Proc. 1st International Conference on Certified Programs and Proofs (2011).

Parser combinators

Philip Wadler. How to replace failure by a list of successes: A method for exception handling, backtracking, and pattern matching in lazy functional languages. Proc. Functional Programming Languages and Computer Architectures (1985).

Graham Hutton and Eric Meijer. Monadic parser combinators. NOTTCS-TR-96-4 (1996).

Doaitse Swierstra. Combinator parsers: from toys to tools. Proc. 2000 Haskell Workshop (2000).

Doaitse Swierstra. Combinator Parsing: A Short Tutorial. Technical Report UU-CS-2008-44, Department of Information and Computing Sciences, Universiteit Utrecht (2008).

Daan Leijen and Erik Meijer. Parsec: Direct Style Monadic Parser Combinators for the Real World. Technical Report UU-CS-2001-35, Department of Information and Computing Sciences, Universiteit Utrecht (2001).

(Here is the Utrecht CS Technical Report Web page.)

Earley parsers

Jay Earley. An efficient context-free parsing algorithm. Communications of the ACM 13, 2 (1970).

Joop Leo. A general context-free parsing algorithm running in linear time on every LR(k) grammar without using lookahead. Theoretical Computer Science 82 (1991).

John Aycock and Nigel Horspool. Practical Earley parsing. The Computer Journal, 45, 6 (2002).

Jeffrey Kegler. Marpa, a practical general parser: the recognizer (on his web page).

Recursive ascent and LR parsing

Rene Leermakers. Recursive ascent: from Earley to Marcus. Theoretical Computer Science 104 (1992).

Peter Pepper. LR Parsing = Grammar Transformation + LL Parsing - Making LR Parsing More Understandable And More Efficient. Manuscript available on CiteSeerX (1999).

Ralf Hinze and Ross Patterson. Derivation of a typed functional LR parser. Manuscript available on Hinze's web page (2005).

More papers may be posted.