Lecture |
Topics |
Complementary Readings |
Assigned Readings |
Jan 7 |
Course overview |
||
Jan 12 |
Uncertainty: Bayesian Networks (BNs) (slides) |
[RN] Chapter 13, Sections 14.1-4 |
|
Jan 14 |
Time: Dynamic Bayesian Networks (DBNs) (slides) | [RN] Chapter 15 |
|
Jan 19 |
Decisions: Decision Networks (DNs) (slides) |
[RN] Chapter 16 |
|
Jan 21 |
Decisions + time: Markov Decision Processes (MDPs) (slides) |
[RN] Sections 17.1-3 [P] entire book |
|
Jan 26 |
Partial observability: Partially Observable Markov Decision Processes (POMDPs) (slides) |
[RN] Sections 17.4-5 |
|
Jan 28 |
Adaptivity: Reinforcement Learning (slides) |
[RN] Chapter 21 |
|
Feb 2 |
Multi-agent: Game Theory (slides) |
[RN] Section 17.6 | |
Feb 4 |
Approximate reasoning: multiplicative and additive decompositions (basis functions) |
||
Feb 9 |
Approximate reasoning: Markov Chain Monte Carlo |
1) Murphy, Weiss, The Factored Frontier Algorithm for Approximate Inference in DBNs, UAI, 2001 (Presenter: Siavash Rahbar Noudehi)
2) Guestrin, Koller, Parr, Venkataraman, Efficient Solution Algorithms for Factored MDPs, JAIR 19: 399-468, 2003 (Presenter: Siavash Rahbar Noudehi) |
|
Feb 11 |
Approximate reasoning: Sequential Monte Carlo |
Project proposal due | |
Feb 23 |
Hierarchical Models: hierarchical HMMs & DBNs |
3) Isard and Blake, Condensation -- Conditional density propagation for visual tracking, IJCV 29 (1): 5-28, 1998 (Presenter: Leong Fong)
4) Fox, Thrun, Burgard, Dellaert, Particle Filters for Mobile Robot Localization, in Doucet, de Freitas and Gordon, editors, Sequential Monte Carlo Methods in Practice, pp. 499-516, 2001 (Presenter: Laleh Soltan Ghoraie) |
|
Feb 25 |
Hierarchical Models: MDPs, POMDPs, RL (slidesA, slidesB) |
||
Mar 2 |
Relational models: relational Bayesian networks (slides) |
5) Fine, Singer, Tishby, The Hierarchical Hidden Markov Model: Analysis and Applications, Machine Learning 32: 41–62, 1998 (Presenters: Xiling Chen and Dong Han)
6) Marc Toussaint, Laurent Charlin and Pascal Poupart, Hierarchical POMDP Controller Optimization by Likelihood Maximization, UAI, 2008 |
|
Mar 4 |
Relational models: first-order MDPs |
||
Mar 9 |
Continuous time models: continuous DBNs and MDPs |
7) Rodrigo de Salvo Braz, Eyal Amir and Dan Roth, Lifted First-Order Probabilistic Inference, IJCAI, 2005 (Presenters: Zhiping Wu and Yuxin Yu)
8) Wang, Joshi, Khardon, First Order Decision Diagrams for Relational MDPs, JAIR 31: 431-472, 2008 |
|
Mar 11 |
Continuous state and action models: control | ||
Mar 16 |
Bayesian learning: Bayesian reinforcement learning (slides) |
9) Nodelman, Koller, Shelton, Expectation Propagation for Continuous Time Bayesian Networks, UAI, 2005 (Presenters: Mohamad Ahmadi and Omid Ardakanian)
10) Hoffman, de Freitas, Doucet, Peters, An Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Rewards, AISTATS, 2009 |
|
Mar 18 |
Bayesian learning: Bayesian multi-agent learning | ||
Mar 23 |
Kernel methods: value function approximation |
11) Pascal Poupart, Nikos Vlassis, Jesse Hoey and Kevin Regan, An Analytic Solution to Discrete Bayesian Reinforcement Learning, ICML, 2006
12) Piotr Gmytrasiewicz and Prashant Doshi, A Framework for Sequential Planning in Multiagent Settings, JAIR 24: 49-79, 2005 (Presenters: Scott Maclean and Omar Nafees) |
|
Mar 25 |
Kernel methods: Gaussian processes |
||
Mar 30 |
Decentralized cooperative multi-agent models |
13) Engel, Mannor, Meir, Reinforcement learning with Gaussian processes, ICML, 2005 (Presenters: Trever Bekolay and Ricardo Salmon)
14) Taylor, Parr, Kernelized Value Function Approximation for Reinforcement Learning, ICML, 2009 (Presenters: Fares Al-Qunaieer and Ahmed Othman) |
|
Apr 1 |
Course wrap-up |
15) Zinkevich, Johanson, Bowling, Piccione, Regret Minimization in Games with Incomplete Information, NIPS, 2007 (Presenter: Arthur Carvalho)
16) Bernstein, Amato, Hansen and Zilberstein, Policy Iteration for Decentralized Control of Markov Decision Processes, JAIR, 2009 (Presenter: David Fagnan) |
|
Apr 12 |
Project due |