Figure 22.1 - Components of an RL system.png | 14.9 KB | |
Figure 22.2 - The recursive relationship expressed by the Bellman equation.png | 13.2 KB | |
Figure 22.3 - Convergence of policy evaluation and improvement.png | 23.1 KB | |
Figure 22.4 - 3×4 gridworld rewards, value function, and optimal policy.png | 20.4 KB | |
Figure 22.5 - RL agent's behavior during the Lunar Lander episode.png | 17.2 KB | |
Figure 22.6 - The DDQN agent's performance in the Lunar Lander environment.png | 448.6 KB | |
Figure 22.7 - Trading agent performance relative to the market.png | 278.9 KB | |