CS886 Winter10 - Schedule

This is a tentative schedule only. As the course progresses, the schedule will be adjusted.

References:

[RN] Russell and Norvig, Artificial Intelligence: A Modern Approach, 2nd Edition, 2003 (on reserve at the library)
[SB] Sutton & Barto, Reinforcement Learning
[P] Puterman, Markov Decision Processes (1994)

Lecture	Topics	Complementary Readings	Assigned Readings
Jan 7	Course overview
Jan 12	Uncertainty: Bayesian Networks (BNs) (slides)	[RN] Chapter 13, Sections 14.1-4
Jan 14	Time: Dynamic Bayesian Networks (DBNs) (slides)	[RN] Chapter 15
Jan 19	Decisions: Decision Networks (DNs) (slides)	[RN] Chapter 16
Jan 21	Decisions + time: Markov Decision Processes (MDPs) (slides)	[RN] Sections 17.1-3 [P] entire book
Jan 26	Partial observability: Partially Observable Markov Decision Processes (POMDPs) (slides)	[RN] Sections 17.4-5
Jan 28	Adaptivity: Reinforcement Learning (slides)	[RN] Chapter 21
Feb 2	Multi-agent: Game Theory (slides)	[RN] Section 17.6
Feb 4	Approximate reasoning: multiplicative and additive decompositions (basis functions)
Feb 9	Approximate reasoning: Markov Chain Monte Carlo		1) Murphy, Weiss, The Factored Frontier Algorithm for Approximate Inference in DBNs, UAI, 2001. (Presenter: Siavash Rahbar Noudehi) 2) Guestrin, Koller, Parr, Venkataraman, Efficient Solution Algorithms for Factored MDPs. JAIR, 19: 399-468 (2003) (Presenter: Siavash Rahbar Noudehi)
Feb 11	Approximate reasoning: Sequential Monte Carlo		project proposal due
Feb 23	Hierarchical Models: hierarchical HMMs & DBNs		3) Isard and Blake, Condensation -- Conditional density propagation for visual tracking, IJCV 29 (1) 5-28, 1998 (Presenter: Leong Fong) 4) Fox, Thrun, Burgard, Dellaert, Particle Filters for Mobile Robot Localization, In Doucet, de Freitas and Gordon, editors, Sequential Monte Carlo Methods in Practice, p 499-516, 2001 (Presenter: Laleh Soltan Ghoraie)
Feb 25	Hierarchical Models: MDPs, POMDPs, RL (slidesA, slidesB)
Mar 2	Relational models: relational Bayesian networks (slides)		5) Fine, Singer, Tishby, The Hierarchical Hidden Markov Model: Analysis and Applications, Machine Learning, 32, 41–62 (1998) (Presenters: Xiling Chen and Dong Han) 6) Marc Toussaint, Laurent Charlin and Pascal Poupart, Hierarchical POMDP Controller Optimization by Likelihood Maximization, UAI, 2008
Mar 4	Relational models: first-order MDPs
Mar 9	Continuous time models: continuous DBNs and MDPs		7) Rodrigo de Salvo Braz and Eyal Amir and Dan Roth, Lifted First Order Probabilistic Inference, IJCAI 2005. (Presenters: Zhiping Wu and Yuxin Yu) 8) Wang, Joshi, Kardon, First Order Decision Diagrams for Relational MDPs, JAIR 31 (2008) 431-472
Mar 11	Continuous state and action models: control
Mar 16	Bayesian learning: Bayesian reinforcement learning (slides)		9) Nodelman, Koller, Shelton, Expectation Propagation for Continuous Time Bayesian Networks, UAI, 2005 (Presenters: Mohamad Ahmadi and Omid Ardakanian) 10) Hoffman, de Freitas, Doucet, Peters, An Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Rewards, AISTATS, 2009
Mar 18	Bayesian learning: Bayesian multi-agent learning
Mar 23	Kernel methods: value function approximation		11) Pascal Poupart, Nikos Vlassis, Jesse Hoey and Kevin Regan, An Analytic Solution to Discrete Bayesian Reinforcement Learning, ICML 2006 12) Piotr Gmytrasiewicz and Prashant Doshi, "A Framework for Sequential Planning in Multiagent Settings", in Journal of AI Research (JAIR), Vol 24: 49-79, 2005 (Presenters: Scott Maclean and Omar Nafees)
Mar 25	Kernel methods: Gaussian processes
Mar 30	Decentralized cooperative multi-agent models		13) Engel, Mannor, Meir, Reinforcement learning with Gaussian processes, ICML 2005 (Presenters: Trever Bekolay and Ricardo Salmon) 14) Taylor, Parr, Kernelized Value Function Approximation for Reinforcement Learning, ICML 2009 (Presenters: Fares Al-Qunaieer and Ahmed Othman)
Apr 1	Course wrap-up		15) Zinkevich, Johanson, Bowling, Piccione, Regret Minimization in Games with Incomplete Information, NIPS, 2007 (Presenter: Arthur Carvalho) 16) Bernstein, Amato, Hansen and Zilberstein, Policy Iteration for Decentralized Control of Markov Decision Processes, JAIR, 2009 (Presenter: David Fagnan)
Apr 12	project due