Modeling and Simulation in Python
Chapter 2
Copyright 2017 Allen Downey
Modeling a bikeshare system
We'll start with a State object that represents the number of bikes at each station.
When you display a State object, it lists the state variables and their values:
We can access the state variables using dot notation.
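The code cells from the notebook are not reproduced in this text, so here is a minimal sketch of what this step might look like. It rests on two assumptions: the initial values (10 bikes at Olin, 2 at Wellesley) are illustrative, and SimpleNamespace from the Python standard library stands in for the State object from modsim.py, so the display will not look exactly like the real one.

```python
from types import SimpleNamespace

# Stand-in for modsim's State object (an assumption for illustration):
# SimpleNamespace also stores named values and supports dot notation.
bikeshare = SimpleNamespace(olin=10, wellesley=2)

# Displaying the object lists the state variables and their values.
print(bikeshare)

# Dot notation reads a single state variable.
print(bikeshare.olin)
print(bikeshare.wellesley)
```

With the stand-in, a misspelled name such as bikeshare.wellesly raises an AttributeError, which is the kind of error the next exercise asks you to provoke.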
Exercise: What happens if you spell the name of a state variable wrong? Edit the previous cell, change the spelling of wellesley, and run the cell again.
The error message uses the word "attribute", which is another name for what we are calling a state variable.
Exercise: Add a third attribute called babson with initial value 0, and display the state of bikeshare again.
Updating
We can use the update operators += and -= to change state variables.
If we display bikeshare, we should see the change.
Of course, if we subtract a bike from olin, we should add it to wellesley.
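As a small sketch of the update operators in action, again using SimpleNamespace as a stand-in for State and assumed starting values of 10 and 2:

```python
from types import SimpleNamespace

# Stand-in for modsim's State object; the starting values are assumptions.
bikeshare = SimpleNamespace(olin=10, wellesley=2)

# Move one bike: remove it from Olin, then add it to Wellesley.
bikeshare.olin -= 1
bikeshare.wellesley += 1

# The total number of bikes is unchanged; only their location differs.
print(bikeshare)
```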
Functions
We can take the code we've written so far and encapsulate it in a function.
Defining a function does not run the statements inside it. The statements run only when you call the function.
One common error is to omit the parentheses, which has the effect of looking up the function, but not calling it.
The output indicates that bike_to_wellesley is a function defined in a "namespace" called __main__, but you don't have to understand what that means.
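The following sketch shows one way the function described in this section might be written, along with the difference between calling it and merely looking it up. The state object is the same SimpleNamespace stand-in with assumed starting values.

```python
from types import SimpleNamespace

# Stand-in for modsim's State object; the starting values are assumptions.
bikeshare = SimpleNamespace(olin=10, wellesley=2)

def bike_to_wellesley():
    """Move one bike from Olin to Wellesley."""
    bikeshare.olin -= 1
    bikeshare.wellesley += 1

# Defining the function changed nothing; calling it runs the body.
bike_to_wellesley()
print(bikeshare)

# Without parentheses, the name is only looked up, not called,
# so this prints a description of the function object.
print(bike_to_wellesley)
```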
Exercise: Define a function called bike_to_olin that moves a bike from Wellesley to Olin. Call the new function and display bikeshare to confirm that it works.
Conditionals
modsim.py provides flip, which takes a probability and returns either True or False, which are special values defined by Python.
The Python function help looks up a function and displays its documentation.
In the following example, the probability is 0.7 or 70%. If you run this cell several times, you should get True about 70% of the time and False about 30%.
In the following example, we use flip as part of an if statement. If the result from flip is True, we print heads; otherwise we do nothing.
With an else clause, we can print heads or tails depending on whether flip returns True or False.
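To make the behavior of flip concrete, here is a hedged sketch. The flip defined below is a hypothetical stand-in built on the standard random module, not the version in modsim.py. It checks the 70% claim over many trials and then uses the result in an if statement with an else clause.

```python
import random

def flip(p=0.5):
    """Return True with probability p, otherwise False (hypothetical stand-in)."""
    return random.random() < p

random.seed(1)  # fixed seed so the run is repeatable

# Over many trials, about 70% of the results should be True.
trials = 10_000
fraction_true = sum(flip(0.7) for _ in range(trials)) / trials
print(fraction_true)

# An if statement with an else clause chooses between two outcomes.
if flip(0.7):
    print('heads')
else:
    print('tails')
```

Dropping the else clause gives the first form described above: heads is printed when flip returns True, and nothing happens otherwise.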
Step
Now let's get back to the bikeshare state. Again let's start with a new State object.
Suppose that in any given minute, there is a 50% chance that a student picks up a bike at Olin and rides to Wellesley. We can simulate that like this.
And maybe at the same time, there is also a 40% chance that a student at Wellesley rides to Olin.
We can wrap that code in a function called step that simulates one time step. In any given minute, a student might ride from Olin to Wellesley, from Wellesley to Olin, or both, or neither, depending on the results of flip.
Since this function takes no parameters, we call it like this:
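A sketch of a parameterless step, under the same assumptions as before: SimpleNamespace stands in for State, the starting values are illustrative, and flip is a hypothetical stand-in for the function in modsim.py.

```python
import random
from types import SimpleNamespace

# Stand-in for modsim's State object; the starting values are assumptions.
bikeshare = SimpleNamespace(olin=10, wellesley=2)

def flip(p=0.5):
    """Return True with probability p (hypothetical stand-in for modsim's flip)."""
    return random.random() < p

def bike_to_wellesley():
    """Move one bike from Olin to Wellesley."""
    bikeshare.olin -= 1
    bikeshare.wellesley += 1

def bike_to_olin():
    """Move one bike from Wellesley to Olin."""
    bikeshare.wellesley -= 1
    bikeshare.olin += 1

def step():
    """Simulate one minute with hard-coded probabilities."""
    if flip(0.5):
        bike_to_wellesley()
    if flip(0.4):
        bike_to_olin()

# No parameters, so the call has empty parentheses.
step()
print(bikeshare)
```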
Parameters
As defined in the previous section, step is not as useful as it could be, because the probabilities 0.5 and 0.4 are "hard coded".
It would be better to generalize this function so it takes the probabilities p1 and p2 as parameters:
Now we can call it like this:
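The generalized version might look like the sketch below. As before, SimpleNamespace and flip are stand-ins for what modsim.py provides, and the starting values are assumptions.

```python
import random
from types import SimpleNamespace

# Stand-in for modsim's State object; the starting values are assumptions.
bikeshare = SimpleNamespace(olin=10, wellesley=2)

def flip(p=0.5):
    """Return True with probability p (hypothetical stand-in for modsim's flip)."""
    return random.random() < p

def step(p1, p2):
    """Simulate one minute.

    p1: probability of a ride from Olin to Wellesley
    p2: probability of a ride from Wellesley to Olin
    """
    if flip(p1):
        bikeshare.olin -= 1
        bikeshare.wellesley += 1
    if flip(p2):
        bikeshare.wellesley -= 1
        bikeshare.olin += 1

# Same behavior as the hard-coded version, but the caller picks the values.
step(0.5, 0.4)
print(bikeshare)
```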
Exercise: At the beginning of step, add a print statement that displays the values of p1 and p2. Call it again with values 0.3 and 0.2, and confirm that the values of the parameters are what you expect.
For loop
Before we go on, I'll redefine step without the print statements.
And let's start again with a new State object:
We can use a for loop to move 4 bikes from Olin to Wellesley.
Or we can simulate 4 random time steps.
If each step corresponds to a minute, we can simulate an entire hour like this.
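A sketch of the hour-long simulation, using the same stand-ins as earlier (SimpleNamespace for State, a hypothetical flip). The probabilities 0.3 and 0.2 and the starting values are assumptions chosen for illustration.

```python
import random
from types import SimpleNamespace

# Stand-in for modsim's State object; the starting values are assumptions.
bikeshare = SimpleNamespace(olin=10, wellesley=2)

def flip(p=0.5):
    """Return True with probability p (hypothetical stand-in for modsim's flip)."""
    return random.random() < p

def step(p1, p2):
    """Simulate one minute of rides in each direction."""
    if flip(p1):
        bikeshare.olin -= 1
        bikeshare.wellesley += 1
    if flip(p2):
        bikeshare.wellesley -= 1
        bikeshare.olin += 1

# One iteration per minute; 60 iterations simulate an hour.
for i in range(60):
    step(0.3, 0.2)

print(bikeshare)
```

Nothing in step checks whether a station has a bike available, so a count can go below zero.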
After 60 minutes, you might see that the number of bikes at Olin is negative. We'll fix that problem in the next notebook.
But first, we want to plot the results.
TimeSeries
modsim.py provides an object called a TimeSeries that can contain a sequence of values changing over time.
We can create a new, empty TimeSeries like this:
And we can add a value to the TimeSeries like this:
The 0 in brackets is an index that indicates that this value is associated with time step 0.
Now we'll use a for loop to save the results of the simulation. I'll start one more time with a new State object.
Here's a for loop that runs 10 steps and stores the results.
Now we can display the results.
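The sketch below illustrates the idea of recording one value per time step. It uses a plain Python dict as a stand-in for TimeSeries, which is an assumption made so the example runs without modsim.py. Bracket indexing works the same way, but a dict has none of the Series methods. Storing each result at index i + 1 is also a choice made here, so that index 0 keeps the initial state.

```python
import random
from types import SimpleNamespace

# Stand-in for modsim's State object; the starting values are assumptions.
bikeshare = SimpleNamespace(olin=10, wellesley=2)

def flip(p=0.5):
    """Return True with probability p (hypothetical stand-in for modsim's flip)."""
    return random.random() < p

def step(p1, p2):
    """Simulate one minute of rides in each direction."""
    if flip(p1):
        bikeshare.olin -= 1
        bikeshare.wellesley += 1
    if flip(p2):
        bikeshare.wellesley -= 1
        bikeshare.olin += 1

# A dict stands in for TimeSeries: keys are time steps, values are bike counts.
results = {}
results[0] = bikeshare.olin  # the index 0 marks the initial state

# Run 10 steps, recording the number of bikes at Olin after each one.
for i in range(10):
    step(0.3, 0.2)
    results[i + 1] = bikeshare.olin

# Display the recorded values in time order.
for t, bikes in results.items():
    print(t, bikes)
```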
A TimeSeries is a specialized version of a Pandas Series, so we can use any of the functions provided by Series, including several that compute summary statistics:
You can read the documentation of Series here.
Plotting
We can also plot the results like this.
decorate, which is defined in the modsim library, adds a title and labels the axes.
savefig() saves a figure in a file.
The suffix of the filename indicates the format you want. This example saves the current figure in a PDF file.
Exercise: Wrap the code from this section in a function named run_simulation that takes three parameters, named p1, p2, and num_steps.
It should:
1. Create a TimeSeries object to hold the results.
2. Use a for loop to run step the number of times specified by num_steps, passing along the specified values of p1 and p2.
3. After each step, it should save the number of bikes at Olin in the TimeSeries.
4. After the for loop, it should plot the results and decorate the axes.
To test your function:
1. Create a State object with the initial state of the system.
2. Call run_simulation with appropriate parameters.
3. Save the resulting figure.
Optional:
Extend your solution so it creates two TimeSeries objects, keeps track of the number of bikes at Olin and at Wellesley, and plots both series at the end.
Opening the hood
The functions in modsim.py are built on top of several widely-used Python libraries, especially NumPy, SciPy, and Pandas. These libraries are powerful but can be hard to use. The intent of modsim.py is to give you the power of these libraries while making it easy to get started.
In the future, you might want to use these libraries directly, rather than using modsim.py. So we will pause occasionally to open the hood and let you see how modsim.py works.
You don't need to know anything in these sections, so if you are already feeling overwhelmed, you might want to skip them. But if you are curious, read on.
Pandas
This chapter introduces two objects, State and TimeSeries. Both are based on the Series object defined by Pandas, which is a library primarily used for data science.
You can read the documentation of the Series object here.
The primary differences between TimeSeries and Series are:
1. I made it easier to create a new, empty Series while avoiding a confusing inconsistency.
2. I provide a function so the Series looks good when displayed in Jupyter.
3. I provide a function called set that we'll use later.
State has all of those capabilities; in addition, it provides an easier way to initialize state variables, and it provides functions called T and dt, which will help us avoid a confusing error later.
Pyplot
The plot function in modsim.py is based on the plot function in Pyplot, which is part of Matplotlib. You can read the documentation of plot here.
decorate provides a convenient way to call the pyplot functions title, xlabel, ylabel, and legend. It also avoids an annoying warning message if you try to make a legend when you don't have any labelled lines.
NumPy
The flip function in modsim.py uses NumPy's random function to generate a random number between 0 and 1.
You can get the source code for flip by running the following cell.