Speech 2: KHIPU 2019 - Nando de Freitas: Reinforcement Learning

Apr 30, 2020

Slides

Outline:

RL Concepts
Policy gradients
Dynamic programming
Deep Q-networks
Distributional RL
D4PG
PPO and MPO
R2D3
Applications of RL
- AlphaX
- Batch RL