Date | Videos and Slides | Q&A Sessions (E5-3052) | Complementary Readings (optional) | Assignments and Papers (mandatory) |
May 6 |
Welcome Course logistics |
none |
||
May 8 |
Module 1: Course overview (video) (slides)
Module 2: Probability theory (video) (slides)
Module 3: Utility theory (video) (slides) |
none |
||
May 13 |
Module 4: Markov Processes (video) (slides)
Module 5: introduction to Markov Decision Processes (video) (slides) |
Logistics, probabilities and
utilities: recording |
|
|
May 15 |
Module 6: Value Iteration (video) (slides)
Module 7: Policy Iteration (video) (slides) |
Course overview, decision
theory, Markov processes: recording |
1) DP de Farias, B van Roy
(2003) The
linear programming approach to approximate dynamic
programming, Operations Research 51(6),
850-865. Critique deadline: June 4 |
|
May 20 |
Victoria Day |
none |
||
May 22 |
Module 8: Linear Programming (video)
(slides) |
Markov decision processes: recording |
||
May 27 |
no video |
Value and policy iteration: recording |
||
May 29 |
no video |
Linear programming: recording |
||
June 3 |
Module 9: LAO* (video)
(slides) |
Python: recording |
A1 out June 3 |
|
Jun 5 |
no video |
Discuss paper #1: recording | ||
Jun 10 |
Module 10: online
optimization (video)
(slides) |
LAO*: recording |
||
Jun 12 |
no video |
Marmoset and LAO*: recording |
||
Jun 17 |
Module 11: intro to
reinforcement learning (video)
(slides) |
Online optimization: recording |
2) S Gelly, L Kocsis, M
Schoenauer, M Sebag, D Silver, C Szepesvari, O Teytaud
(2012) The
grand challenge of Computer Go: Monte Carlo tree search
and extensions, Communications of the ACM, Volume 55,
Issue 3, pages 106-113. Critique deadline: July 2 |
|
Jun 19 |
no video |
Games: recording |
A1 due June 19 | |
Jun 24 |
Module 12: bandits (video)
(slides) |
Intro to reinforcement
learning: recording |
||
Jun 26 |
no video |
Nuts and bolts of TD
learning: recording |
||
Jul 1 |
Canada Day |
none |
||
Jul 3 |
Module 13: Bayesian bandits (video)
(slides) |
Discuss paper #2: recording |
||
Jul 8 |
no video |
Bandits: recording |
||
Jul 10 |
no video |
Bayesian bandits: recording |
||
Jul 15 |
Module 14: Intro to POMDPs (slides) |
Intro to POMDPs: recording |
||
Jul 17 |
no video |
Point-based value iteration:
recording |
3) S Young, M Gasic, B
Thomson and JD Williams (2013) POMDP-based
Statistical Spoken Dialogue Systems: a Review, Proc
IEEE, Vol 101, No. 5, 1160-1179. Critique deadline: July 29 |
|
Jul 22 |
Module 15: POMDP bounds (slides) |
POMDP bounds: recording |
||
Jul 24 |
no video |
Bayesian RL: recording |
A2 due July 25 |
|
Jul 29 |
Bayesian RL tutorial (slides) |
Bayesian RL: recording |
||
Jul 30 |
Monday's schedule (mandated
by university) |
Discuss paper #3: recording |