Stanford CS234: Reinforcement Learning

Stanford, Winter 2019, Prof. Emma Brunskill

Updated On 02 Feb, 19

Overview

Realizing the dreams and impact of AI requires autonomous systems that learn to make good decisions. Reinforcement learning is one powerful paradigm for doing so, and it is relevant to an enormous range of tasks, including robotics, game playing, consumer modeling, and healthcare. This class provides a solid introduction to the field of reinforcement learning; students will learn about the core challenges and approaches, including generalization and exploration. Through a combination of lectures and written and coding assignments, students will become well versed in key ideas and techniques for RL. Assignments will cover the basics of reinforcement learning as well as deep reinforcement learning, an extremely promising new area that combines deep learning techniques with reinforcement learning. In addition, students will advance their understanding and the field of RL through a final project.

Lecture 5: Value Function Approximation

Lecture Details

For more information about Stanford’s Artificial Intelligence professional and graduate programs, visit: https://stanford.io/ai

Professor Emma Brunskill, Stanford University
http://onlinehub.stanford.edu/

Professor Emma Brunskill
Assistant Professor, Computer Science
Stanford AI for Human Impact Lab
Stanford Artificial Intelligence Lab
Statistical Machine Learning Group

To follow along with the course schedule and syllabus, visit: http://web.stanford.edu/class/cs234/index.html

0:00 Introduction
1:19 Class Structure
3:18 Value Function Approximation (VFA)
4:26 Motivation for VFA
5:01 Benefits of Generalization
10:03 Function Approximators
11:16 Review: Gradient Descent
13:47 Value Function Approximation for Policy Evaluation with an Oracle
15:11 Stochastic Gradient Descent
18:02 Model Free VFA Policy Evaluation
18:22 Model Free VFA Prediction / Policy Evaluation
19:06 Feature Vectors
30:06 MC Linear Value Function Approximation for Policy Evaluation
35:48 Baird (1995)-Like Example with MC Policy Evaluation
43:55 Convergence Guarantees for Linear Value Function Approximation for Policy Evaluation: Preliminaries
50:43 Batch Monte Carlo Value Function Approximation
53:48 Recall: Temporal Difference Learning w/ Lookup Table
54:42 Temporal Difference (TD(0)) Learning with Value Function Approximation
57:40 TD(0) Linear Value Function Approximation for Policy Evaluation
58:10 Baird Example with TD(0) On Policy Evaluation