CS886 Spring 2013 - Schedule and Course Material

This schedule is tentative and will be adjusted as the course progresses.

Date
Videos and Slides
Q&A Sessions E5-3052
Complementary Readings (optional)
Assignments and Papers (mandatory)
May 6
Welcome
Course logistics
none


May 8
Module 1: Course overview (video) (slides)
Module 2: Probability theory (video) (slides)
Module 3: Utility theory (video) (slides)
none


May 13
Module 4: Markov Processes (video) (slides)
Module 5: Introduction to Markov Decision Processes (video) (slides)
Logistics, probabilities and utilities: recording


May 15
Module 6: Value Iteration (video) (slides)
Module 7: Policy Iteration (video) (slides)
Course overview, decision theory, Markov processes: recording

1) DP de Farias, B van Roy (2003) The linear programming approach to approximate dynamic programming, Operations Research 51(6), 850-865.  Critique deadline: June 4
May 20
Victoria Day
none


May 22
Module 8: Linear Programming (video) (slides)
Markov decision processes: recording


May 27
no video
Value and policy iteration: recording


May 29
no video
Linear programming: recording


Jun 3
Module 9: LAO* (video) (slides)
Python: recording

A1 out June 3
Jun 5
no video
Discuss paper #1: recording

Jun 10
Module 10: Online optimization (video) (slides)
LAO*: recording


Jun 12
no video
Marmoset and LAO*: recording


Jun 17
Module 11: Intro to reinforcement learning (video) (slides)
Online optimization: recording

2) S Gelly, L Kocsis, M Schoenauer, M Sebag, D Silver, C Szepesvari, O Teytaud (2012) The grand challenge of Computer Go: Monte Carlo tree search and extensions, Communications of the ACM, Volume 55, Issue 3, pages 106-113.  Critique deadline: July 2
Jun 19
no video
Games: recording

A1 due June 19
Jun 24
Module 12: Bandits (video) (slides)
Intro to reinforcement learning: recording


Jun 26
no video
Nuts and bolts of TD learning: recording


Jul 1
Canada Day
none


Jul 3
Module 13: Bayesian bandits (video) (slides)
Discuss paper #2: recording


Jul 8
no video
Bandits: recording


Jul 10
no video
Bayesian bandits: recording


Jul 15
Module 14: Intro to POMDPs (slides)
Intro POMDPs: recording


Jul 17
no video
Point-based value iteration: recording

3) S Young, M Gasic, B Thomson and JD Williams (2013) POMDP-based Statistical Spoken Dialogue Systems: a Review, Proc IEEE, Vol 101, No. 5, 1160-1179.  Critique deadline: July 29
Jul 22
Module 15: POMDP bounds (slides)
POMDP bounds: recording


Jul 24
no video
Bayesian RL: recording

A2 due July 25
Jul 29
Bayesian RL tutorial (slides)
Bayesian RL: recording


Jul 30
Monday's schedule (mandated by university)
Discuss paper #3: recording