CS886 Winter10 - Schedule

This schedule is tentative and will be adjusted as the course progresses.

References:


Lecture
Topics
Complementary Readings
Assigned Readings
Jan 7
Course overview


Jan 12
Uncertainty: Bayesian Networks (BNs) (slides)
[RN] Chapter 13, Sections 14.1-4


Jan 14
Time: Dynamic Bayesian Networks (DBNs) (slides)
[RN] Chapter 15

Jan 19
Decisions: Decision Networks (DNs) (slides)
[RN] Chapter 16


Jan 21
Decisions + time: Markov Decision Processes (MDPs) (slides)
[RN] Sections 17.1-3
[P] entire book

Jan 26
Partial observability: Partially Observable Markov Decision Processes (POMDPs) (slides)
[RN] Sections 17.4-5

Jan 28
Adaptivity: Reinforcement Learning (slides)
[RN] Chapter 21

Feb 2
Multi-agent: Game Theory (slides)
[RN] Section 17.6
Feb 4
Approximate reasoning: multiplicative and additive decompositions (basis functions)


Feb 9
Approximate reasoning: Markov Chain Monte Carlo

1) Murphy, Weiss, The Factored Frontier Algorithm for Approximate Inference in DBNs, UAI, 2001 (Presenter: Siavash Rahbar Noudehi)
2) Guestrin, Koller, Parr, Venkataraman, Efficient Solution Algorithms for Factored MDPs, JAIR 19: 399-468, 2003 (Presenter: Siavash Rahbar Noudehi)
Feb 11
Approximate reasoning: Sequential Monte Carlo

project proposal due
Feb 23
Hierarchical Models: hierarchical HMMs & DBNs

3) Isard and Blake, Condensation -- Conditional density propagation for visual tracking, IJCV 29 (1) 5-28, 1998 (Presenter: Leong Fong)
4) Fox, Thrun, Burgard, Dellaert, Particle Filters for Mobile Robot Localization, in Doucet, de Freitas and Gordon, editors, Sequential Monte Carlo Methods in Practice, pp. 499-516, 2001 (Presenter: Laleh Soltan Ghoraie)
Feb 25
Hierarchical Models: MDPs, POMDPs, RL (slidesA, slidesB)


Mar 2
Relational models: relational Bayesian networks (slides)

5) Fine, Singer, Tishby, The Hierarchical Hidden Markov Model: Analysis and Applications, Machine Learning, 32, 41–62 (1998) (Presenters: Xiling Chen and Dong Han)
6) Marc Toussaint, Laurent Charlin and Pascal Poupart, Hierarchical POMDP Controller Optimization by Likelihood Maximization, UAI, 2008
Mar 4
Relational models: first-order MDPs


Mar 9
Continuous time models: continuous DBNs and MDPs
7) Rodrigo de Salvo Braz, Eyal Amir and Dan Roth, Lifted First Order Probabilistic Inference, IJCAI, 2005 (Presenters: Zhiping Wu and Yuxin Yu)
8) Wang, Joshi, Khardon, First Order Decision Diagrams for Relational MDPs, JAIR 31: 431-472, 2008
Mar 11
Continuous state and action models: control

Mar 16
Bayesian learning: Bayesian reinforcement learning (slides)
9) Nodelman, Koller, Shelton, Expectation Propagation for Continuous Time Bayesian Networks, UAI, 2005 (Presenters: Mohamad Ahmadi and Omid Ardakanian)
10) Hoffman, de Freitas, Doucet, Peters, An Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Rewards, AISTATS, 2009
Mar 18
Bayesian learning: Bayesian multi-agent learning

Mar 23
Kernel methods: value function approximation

11) Pascal Poupart, Nikos Vlassis, Jesse Hoey and Kevin Regan, An Analytic Solution to Discrete Bayesian Reinforcement Learning, ICML 2006
12) Piotr Gmytrasiewicz and Prashant Doshi, A Framework for Sequential Planning in Multiagent Settings, JAIR 24: 49-79, 2005 (Presenters: Scott Maclean and Omar Nafees)
Mar 25
Kernel methods: Gaussian processes


Mar 30
Decentralized cooperative multi-agent models

13) Engel, Mannor, Meir, Reinforcement learning with Gaussian processes, ICML 2005 (Presenters: Trever Bekolay and Ricardo Salmon)
14) Taylor, Parr, Kernelized Value Function Approximation for Reinforcement Learning, ICML 2009 (Presenters: Fares Al-Qunaieer and Ahmed Othman)
Apr 1
Course wrap-up

15) Zinkevich, Johanson, Bowling, Piccione, Regret Minimization in Games with Incomplete Information, NIPS, 2007 (Presenter: Arthur Carvalho)
16) Bernstein, Amato, Hansen and Zilberstein, Policy Iteration for Decentralized Control of Markov Decision Processes, JAIR, 2009 (Presenter: David Fagnan)
Apr 12
project due