## Chapter 19: Reinforcement Learning for Decision Making in Complex Environments

### Chapter Outline

- Introduction: learning from experience
  - Understanding reinforcement learning
  - Defining the agent-environment interface of a reinforcement learning system
- The theoretical foundations of RL
  - Markov decision processes
    - The mathematical formulation of Markov decision processes
    - Visualization of a Markov process
  - Episodic versus continuing tasks
  - RL terminology: return, policy, and value function
    - The return
    - Policy
    - Value function
  - Dynamic programming using the Bellman equation
- Reinforcement learning algorithms
  - Dynamic programming
    - Policy evaluation – predicting the value function with dynamic programming
    - Improving the policy using the estimated value function
    - Policy iteration
    - Value iteration
  - Reinforcement learning with Monte Carlo
    - State-value function estimation using MC
    - Action-value function estimation using MC
    - Finding an optimal policy using MC control
    - Policy improvement – computing the greedy policy from the action-value function
  - Temporal difference learning
    - TD prediction
    - On-policy TD control (SARSA)
    - Off-policy TD control (Q-learning)
- Implementing our first RL algorithm
  - Introducing the OpenAI Gym toolkit
    - Working with the existing environments in OpenAI Gym
    - A grid world example
    - Implementing the grid world environment in OpenAI Gym
  - Solving the grid world problem with Q-learning
  - Implementing the Q-learning algorithm
- A glance at deep Q-learning
  - Training a DQN model according to the Q-learning algorithm
    - Replay memory
    - Determining the target values for computing the loss
  - Implementing a deep Q-learning algorithm
- Chapter and book summary
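
A few of the topics above have compact standard formulations that are worth keeping at hand while working through the code examples. As a quick reference for the outline items on the return, the value function, and the Bellman equation, the definitions below use common Sutton-and-Barto-style notation, which may differ slightly from the exact symbols used in the chapter itself:

```latex
% Discounted return from time step t, with discount factor gamma in [0, 1]
G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \dots
    = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}

% State-value function of a policy pi: the expected return when starting in state s
v_\pi(s) = \mathbb{E}_\pi\left[ G_t \mid S_t = s \right]

% Bellman equation: the value of s written in terms of its successor states s'
v_\pi(s) = \sum_{a} \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a)
           \left[ r + \gamma\, v_\pi(s') \right]
```

Loosely speaking, dynamic programming, Monte Carlo, and temporal difference methods differ mainly in how they estimate the right-hand side of the Bellman equation: from the full model `p(s', r | s, a)`, from complete sampled episodes, or from bootstrapped one-step targets, respectively.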
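
Before diving into the chapter's own grid world environment and Q-learning implementation, it can help to see the bare tabular algorithm in one place. The sketch below is a minimal, generic Q-learning loop on a small built-in Gym environment; the environment name (`FrozenLake-v1`), the hyperparameter values, and the assumption of the classic Gym API (where `reset()` returns only the observation and `step()` returns four values) are illustrative choices for this sketch, not taken from the chapter.

```python
import numpy as np
import gym

# A minimal tabular Q-learning loop: epsilon-greedy behavior policy,
# off-policy one-step TD update.  Assumes the classic Gym API; newer
# gym/gymnasium versions return additional values from reset() and step().
env = gym.make('FrozenLake-v1')

n_states = env.observation_space.n
n_actions = env.action_space.n
q_table = np.zeros((n_states, n_actions))

alpha, gamma, epsilon = 0.1, 0.99, 0.1   # learning rate, discount, exploration
n_episodes = 5000                        # arbitrary choice for this sketch

for episode in range(n_episodes):
    state = env.reset()
    done = False
    while not done:
        # Epsilon-greedy action selection
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = np.argmax(q_table[state])

        next_state, reward, done, _ = env.step(action)

        # Q-learning update: bootstrap from the greedy value of the next state
        td_target = reward + gamma * np.max(q_table[next_state])
        q_table[state, action] += alpha * (td_target - q_table[state, action])
        state = next_state

print(q_table)
```

The chapter's implementation targets the custom grid world environment built in this chapter rather than FrozenLake, but epsilon-greedy action selection and the one-step TD update toward `r + gamma * max_a Q(s', a)` are the core of any tabular Q-learning agent.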
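
For the deep Q-learning part of the outline, the two pieces that tend to cause the most confusion are the replay memory and the construction of the target values for the loss. The sketch below shows one common way to implement both in PyTorch; the class and function names, the buffer capacity, the batch handling, and the tiny network architecture are assumptions made for illustration, not the chapter's actual code or hyperparameters.

```python
from collections import deque, namedtuple
import random

import numpy as np
import torch
import torch.nn as nn

# One transition observed while interacting with the environment.
Transition = namedtuple(
    'Transition', ('state', 'action', 'reward', 'next_state', 'done'))


class ReplayMemory:
    """Bounded FIFO buffer; uniform random sampling breaks the correlation
    between consecutive transitions."""
    def __init__(self, capacity=10000):      # capacity is an assumed value
        self.memory = deque(maxlen=capacity)

    def push(self, *transition):
        self.memory.append(Transition(*transition))

    def sample(self, batch_size):
        return random.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)


# A small Q-network with illustrative layer sizes (e.g., a 4-dimensional
# state and 2 discrete actions, as in CartPole-like tasks).
q_network = nn.Sequential(
    nn.Linear(4, 32), nn.ReLU(),
    nn.Linear(32, 2))


def compute_td_targets(batch, gamma=0.99):
    """Target values y = r + gamma * max_a' Q(s', a'); y = r for terminal steps."""
    batch = Transition(*zip(*batch))
    rewards = torch.tensor(batch.reward, dtype=torch.float32)
    dones = torch.tensor(batch.done, dtype=torch.float32)
    next_states = torch.tensor(np.array(batch.next_state), dtype=torch.float32)
    with torch.no_grad():
        next_q = q_network(next_states).max(dim=1).values
    return rewards + gamma * (1.0 - dones) * next_q
```

The DQN loss is then typically a mean squared (or Huber) error between the predicted Q-values of the sampled actions and these targets; many implementations also evaluate the targets with a separate, periodically synchronized target network, which this sketch leaves out for brevity.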
Please refer to the README.md file in ../ch01
for more information about running the code examples.