This is a tentative schedule only. As the course progresses, the schedule will be adjusted.
| Lecture | Date | Topic | Readings (textbooks) |
|---|---|---|---|
| 1 | Jan 6 | Introduction to Artificial Intelligence (slides) | [RN3] Chapters 1 and 2 |
| 2 | Jan 8 | Uninformed Search (slides) | [RN3] Sections 3.1-3.4 |
| 3 | Jan 13 | Informed Search (slides, annotated slides) | [RN3] Sections 3.5, 3.6 |
| 4 | Jan 15 | Constraint Satisfaction (slides) | [RN3] Sections 6.1-6.3 |
| 5 | Jan 20 | Uncertainty (slides, annotated slides) | [RN3] Sections 13.1-13.5 |
| 6 | Jan 22 | Bayesian Networks (slides) | [RN3] Sections 14.1, 14.2, 14.4 |
| | Jan 23 | Assignment 1 due (11:59 pm) | |
| 7 | Jan 27 | Bayesian Networks (slides) | [RN3] Sections 14.1, 14.2, 14.4 |
| 8 | Jan 29 | Causal Inference (slides, annotated slides) | [P] Chapter 1 |
| 9 | Feb 3 | Intro to ML and Decision Tree Learning (slides, annotated slides) | [RN3] Sections 18.1-18.4 |
| 10 | Feb 5 | Statistical Learning (slides, annotated slides) | [RN3] Sections 20.1-20.2 |
| | Feb 6 | Assignment 2 due (11:59 pm) | |
| 11 | Feb 10 | Neural Networks (slides, annotated slides) | [RN3] Section 18.7, [ZLLS] Chapter 5 |
| 12 | Feb 12 | Deep Neural Networks (slides, annotated slides) | [ZLLS] Chapter 5 |
| | Feb 12 | Midterm (7:30 - 8:50 pm in M3-1006) | |
| 13 | Feb 24 | Sequence modeling, HMMs (slides) | [RN3] Sections 15.1-15.3, 15.5 |
| 14 | Feb 26 | RNNs, Attention and Transformers (slides, annotated slides) | [ZLLS] Chapters 9, 11, [GBC] Chapter 10 |
| | Feb 27 | Project proposal due at 11:59 pm (CS686 only) | |
| 15 | Mar 3 | Markov Decision Processes (slides, annotated slides) | [RN3] Sections 17.1-17.4, [SB] Chapter 3 and Sections 4.1-4.2, [ZLLS] Sections 17.1-17.2 |
| 16 | Mar 5 | Intro to Reinforcement Learning (slides, annotated slides) | [RN3] Sections 21.1-21.3, [SB] Sections 6.1-6.5, [ZLLS] Section 17.3 |
| | Mar 6 | Assignment 3 due (11:59 pm) | |
| 17 | Mar 10 | Deep Reinforcement Learning (slides, annotated slides) | [SB] Sections 9.4, 9.7, [GBC] Chapters 6, 7, 8 |
| 18 | Mar 12 | Policy Gradient and Monte Carlo Tree Search (slides, annotated slides) | [SB] Sections 8.11, 13.1-13.3 |
| 19 | Mar 17 | Model-based RL and RL from Human Feedback (slides, annotated slides) | [SB] Sections 8.1-8.2; Ouyang, Wu, Jiang, Wainwright, et al. (2022) Training language models to follow instructions with human feedback, NeurIPS. |
| 20 | Mar 19 | LLM Test-time Inference and Reasoning (slides, annotated slides) | Rafailov, Sharma, Mitchell, Ermon, Manning, Finn (2023) Direct Preference Optimization: Your Language Model is Secretly a Reward Model, NeurIPS; Rashid, Wu, Fan, Li, Kristiadi, Poupart (2025) Towards Cost-Effective Reward Guided Text Generation, ICML. |
| | Mar 20 | Assignment 4 due (11:59 pm) | |
| 21 | Mar 24 | Game Theory (slides, annotated slides) | [RN3] Section 17.5, [SB] Sections 21.1-21.3 |
| 22 | Mar 26 | Multi-agent RL (slides, annotated slides) | Caroline Claus and Craig Boutilier (1998) The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, AAAI; Michael Littman (1994) Markov games as a framework for multi-agent reinforcement learning, Machine learning proceedings; Junling Hu and Michael P. Wellman (2003) Nash Q-learning for General-Sum Stochastic Games, JMLR. |
| 23 | Mar 31 | Image Generation (slides, annotated slides) | Steins (2022) Diffusion Models Clearly Explained; Steins (2022) Stable Diffusion Clearly Explained; Mohebbi, Abdulrahman, Miao, Poupart, Kothawade (2026) Image-POSER: Reflective RL for Multi-Expert Image Generation and Editing, arXiv. |
| 24 | Apr 2 | Course Wrap Up (slides) | |
| | Apr 10 | Project report due at 11:59 pm (CS686 only) | |