Introduction to Reinforcement Learning
Spring 2022
This page presents lecture materials for CS 4789/5789: Introduction to Reinforcement Learning, taught by Sarah Dean at Cornell University in spring 2022. For the most recent materials, look here. This course was first taught by Wen Sun in spring 2021.
Schedule
Date | No. | Lecture Title | Materials |
1/24 | 1 | Introduction to RL | Lecture Notes, Slides, Live Notes, Video |
1/26 | 2 | MDPs and Bellman Equations | Lecture Notes, Slides, Live Notes, Video |
1/31 | 3 | MDPs, Optimal Policies, and Value Iteration | Lecture Notes, Slides, Live Notes, Video |
2/2 | 4 | Policy Iteration and Dynamic Programming | Lecture Notes, Slides, Live Notes, Video |
2/7 | 5 | Continuous Control | Lecture Notes, Slides, Live Notes, Video |
2/9 | 6 | Linear Quadratic Regulation | Lecture Notes, Slides, Live Notes, Video |
2/14 | 7 | Nonlinear Control | Lecture Notes, Slides, Live Notes, Video |
2/16 | 8 | Limitations in Control and Observation | Lecture Notes, Slides, Live Notes, Video |
2/21 | 9 | Prediction and Estimation | Lecture Notes, Slides, Live Notes, Video |
2/23 | 10 | Model-based RL | Lecture Notes, Slides, Live Notes, Video |
2/28 | | February Break | |
3/2 | 11 | Approximate and Conservative Policy Iteration | Lecture Notes, Slides, Live Notes, Video |
3/7 | 12 | Supervision via Bellman | Lecture Notes, Slides, Live Notes, Video |
3/9 | 13 | Optimization Background | Lecture Notes, Slides, Live Notes, Video |
3/14 | 14 | Policy Optimization: Random Search and Policy Gradient | Lecture Notes, Slides, Live Notes, Video |
3/16 | 15 | Policy Optimization: Trust Region and Natural PG | Lecture Notes, Slides, Live Notes, Video |
3/21 | 16 | Prelim Review | Slides, Video |
3/23 | 17 | Exploration: Multi-Armed Bandits | Lecture Notes, Slides, Live Notes, Video, Code, Notebook |
3/28 | 18 | Upper Confidence Bound Algorithm | Lecture Notes, Slides, Live Notes, Video |
3/30 | 19 | Contextual Bandits | Lecture Notes, Slides, Live Notes, Video |
4/4 | | Spring Break | |
4/6 | | Spring Break | |
4/11 | 20 | Linear Contextual Bandits | Lecture Notes, Slides, Live Notes, Video, Code, Notebook |
4/13 | 21 | Exploration in MDPs | Lecture Notes, Slides, Live Notes, Video |
4/18 | 22 | Imitation Learning with BC | Lecture Notes, Slides, Live Notes, Video |
4/21 | 23 | Interactive Imitation Learning | Lecture Notes, Slides, Live Notes, Video |
4/25 | 24 | Inverse RL | Lecture Notes, Slides, Live Notes, Video |
4/27 | 25 | Max Entropy IRL | Lecture Notes, Slides, Live Notes, Video |
5/2 | 26 | Specification and Societal Implications | Slides, Video |
5/4 | 27 | AlphaGo Case Study | Slides, Video |
5/9 | 28 | Review | Slides, Video |