GitHub Repository: greyhatguy007/Machine-Learning-Specialization-Coursera
Path: blob/main/C3 - Unsupervised Learning, Recommenders, Reinforcement Learning/week3/optional-labs/State-action value function example.ipynb
Kernel: Python 3

State Action Value Function Example

In this Jupyter notebook, you can modify the Mars rover example to see how the values of Q(s,a) change as you vary the rewards and the discount factor.
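For reference, the quantity being explored can be written as a Bellman equation. The form below is a sketch under stated assumptions rather than a transcription of the notebook's helper code: it assumes the reward $R(s)$ is received in the current state $s$, that $p$ stands for `misstep_prob`, and that a misstep sends the rover one cell in the opposite direction. The symbols $s'_{\text{intended}}$ and $s'_{\text{opposite}}$ are introduced here for illustration only.

$$
Q(s,a) = R(s) + \gamma \Big[ (1-p)\,\max_{a'} Q\big(s'_{\text{intended}}, a'\big) + p\,\max_{a'} Q\big(s'_{\text{opposite}}, a'\big) \Big]
$$

For a terminal state, $Q(s,a) = R(s)$. Setting $p = 0$ recovers the deterministic case $Q(s,a) = R(s) + \gamma \max_{a'} Q(s', a')$.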

import numpy as np
from utils import *

# Do not modify
num_states = 6
num_actions = 2

terminal_left_reward = 100
terminal_right_reward = 40
each_step_reward = 0

# Discount factor
gamma = 0.5

# Probability of going in the wrong direction
misstep_prob = 0.4
generate_visualization(terminal_left_reward, terminal_right_reward, each_step_reward, gamma, misstep_prob)
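The contents of `utils.py` are not shown here, so the following is only a minimal sketch of how Q(s,a) values like those in the visualization could be computed. It assumes the two end states are terminal, that the reward is collected in the current state, and that a misstep moves the rover one cell in the opposite direction. The function name `compute_q_values` and the `num_iters` parameter are hypothetical and are not part of the notebook.

```python
import numpy as np

def compute_q_values(num_states, terminal_left_reward, terminal_right_reward,
                     each_step_reward, gamma, misstep_prob, num_iters=100):
    """Hypothetical helper: iterate the Bellman equation on a 1-D row of states.

    Returns an array of shape (num_states, 2); column 0 is "left", column 1 is "right".
    """
    LEFT, RIGHT = 0, 1

    # Reward received in each state; the two ends are terminal (assumption).
    rewards = np.full(num_states, float(each_step_reward))
    rewards[0] = terminal_left_reward
    rewards[-1] = terminal_right_reward

    q = np.zeros((num_states, 2))
    for _ in range(num_iters):
        best_next = q.max(axis=1)        # max over a' of Q(s', a') for every state
        new_q = np.empty_like(q)
        new_q[0, :] = rewards[0]         # terminal states: Q(s, a) = R(s)
        new_q[-1, :] = rewards[-1]
        for s in range(1, num_states - 1):
            value_left = best_next[s - 1]
            value_right = best_next[s + 1]
            # Intended move with probability 1 - misstep_prob, opposite move otherwise.
            new_q[s, LEFT] = rewards[s] + gamma * (
                (1 - misstep_prob) * value_left + misstep_prob * value_right)
            new_q[s, RIGHT] = rewards[s] + gamma * (
                (1 - misstep_prob) * value_right + misstep_prob * value_left)
        q = new_q
    return q

# Same parameter values as the notebook cell above.
q_values = compute_q_values(num_states=6, terminal_left_reward=100,
                            terminal_right_reward=40, each_step_reward=0,
                            gamma=0.5, misstep_prob=0.4)
print(np.round(q_values, 2))
```

Under these assumptions, calling the function with `misstep_prob=0` and the other values unchanged gives 50 (left) and 12.5 (right) for the state next to the left terminal. Raising `misstep_prob` lowers the value of heading toward the larger reward, because some of the probability mass is spent moving away from it.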