CS886 Spring 2013 - Schedule and Course Material

This is a tentative schedule only.  As the course progresses, the schedule will be adjusted.


Videos and Slides
Q&A Sessions E5-3052
Complementary Readings (optional)
Assignments and Papers (mandatory)
May 6
Course logistics

May 8
Module 1: Course overview (video) (slides)
Module 2: Probability theory (video) (slides)
Module 3: Utility theory (video) (slides)

May 13
Module 4: Markov Processes (video) (slides)
Module 5: Introduction to Markov Decision Processes (video) (slides)
Logistics, probabilities and utilities: recording

May 15
Module 6: Value Iteration (video) (slides)
Module 7: Policy Iteration (video) (slides)
Course overview, decision theory, Markov processes: recording

1) DP de Farias, B van Roy (2003) The linear programming approach to approximate dynamic programming, Operations Research 51(6), 850-865.  Critique deadline: June 4
May 20
Victoria Day

May 22
Module 8: Linear Programming (video) (slides)
Markov decision processes: recording

May 27
no video
Value and policy iteration: recording

May 29
no video
Linear programming: recording

June 3
Module 9: LAO* (video) (slides)
Python: recording

A1 out June 3
Jun 5
no video
Discuss paper #1: recording

Jun 10
Module 10: Online optimization (video) (slides)
LAO*: recording

Jun 12
no video
Marmoset and LAO*: recording

Jun 17
Module 11: Intro to reinforcement learning (video) (slides)
Online optimization: recording

2) S Gelly, L Kocsis, M Schoenauer, M Sebag, D Silver, C Szepesvari, O Teytaud (2012) The grand challenge of Computer Go: Monte Carlo tree search and extensions, Communications of the ACM, Volume 55, Issue 3, pages 106-113.  Critique deadline: July 2
Jun 19
no video
Games: recording

A1 due June 19
Jun 24
Module 12: Bandits (video) (slides)
Intro to reinforcement learning: recording

Jun 26
no video
Nuts and bolts of TD learning: recording

Jul 1
Canada Day

Jul 3
Module 13: Bayesian bandits (video) (slides)
Discuss paper #2: recording

Jul 8
no video
Bandits: recording

Jul 10
no video
Bayesian bandits: recording

Jul 15
Module 14: Intro to POMDPs (slides)
Intro to POMDPs: recording

Jul 17
no video
Point-based value iteration: recording

3) S Young, M Gasic, B Thomson and JD Williams (2013) POMDP-based Statistical Spoken Dialogue Systems: a Review, Proc IEEE, Vol 101, No. 5, 1160-1179.  Critique deadline: July 29
Jul 22
Module 15: POMDP bounds (slides)
POMDP bounds: recording

Jul 24
no video
Bayesian RL: recording

A2 due July 25
Jul 29
Bayesian RL tutorial (slides)
Bayesian RL: recording

Jul 30
Monday's schedule (mandated by university)
Discuss paper #3: recording