Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place.
Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place.
Path: blob/master/chapters/chap01.ipynb
Views: 531
Printed and electronic copies of Modeling and Simulation in Python are available from No Starch Press and Bookshop.org and Amazon.
Modeling
Modeling and Simulation in Python
Copyright 2021 Allen Downey
License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International
Jupyter
Welcome to Modeling and Simulation in Python, and welcome to Jupyter.
This is a Jupyter notebook, which is a development environment where you can write and run Python code. Each notebook is divided into cells. Each cell contains either text (like this cell) or Python code.
To run a cell, hold down SHIFT and press ENTER.
If you run a text cell, Jupyter formats the text and displays the result.
If you run a code cell, Jupyter runs the Python code in the cell and displays the result, if any.
The following cells check whether the libraries we need are installed. If so, the cells produce no output. If not, you'll see updates from the installer.
This chapter introduces the modeling framework we will use throughout the book, and works through our first example, using a simple model of physics to evaluate the claim that a penny falling from the height of the Empire State Building could kill someone if it hit them on the head. Also, you'll see how to do computation in Python with units like meters and seconds.
This chapter is available as a Jupyter notebook where you can read the text, run the code, and work on the exercises. Click here to access the notebooks: https://allendowney.github.io/ModSimPy/.
The Modeling Framework
This book is about modeling and simulating physical systems. The following diagram shows what I mean by modeling:
Starting in the lower left, the system is something in the real world we are interested in. To model the system, we have to decide which elements of the real world to include and which we can leave out. This process is called abstraction.
The result of abstraction is a model, which is a description of the system that includes only the features we think are essential. A model can be represented in the form of diagrams and equations, which can be used for mathematical analysis. It can also be implemented in the form of a computer program, which can run simulations.
The result of analysis and simulation might be a prediction about what the system will do, an explanation of why it behaves the way it does, or a design intended to achieve a purpose.
We can validate predictions and test designs by taking measurements from the real world and comparing the data we get with the results from analysis and simulation.
For any physical system, there are many possible models, each one including and excluding different features, or including different levels of detail. The goal of the modeling process is to find the model best suited to its purpose (prediction, explanation, or design).
Sometimes the best model is the most detailed. If we include more features, the model is more realistic, and we expect its predictions to be more accurate. But often a simpler model is better. If we include only the essential features and leave out the rest, we get models that are easier to work with, and the explanations they provide can be clearer and more compelling.
As an example, suppose someone asks you why the orbit of the Earth is elliptical. If you model the Earth and Sun as point masses (ignoring their actual size), compute the gravitational force between them using Newton's law of universal gravitation, and compute the resulting orbit using Newton's laws of motion, you can show that the result is an ellipse. Of course, the actual orbit of Earth is not a perfect ellipse, because of the gravitational forces of the Moon, Jupiter, and other objects in the solar system, and because Newton's laws of motion are only approximately true (they don't take into account relativistic effects). But adding these features to the model would not improve the explanation; more detail would only be a distraction from the fundamental cause. However, if the goal is to predict the position of the Earth with great precision, including more details might be necessary.
Choosing the best model depends on what the model is for. It is usually a good idea to start with a simple model, even if it is likely to be too simple, and test whether it is good enough for its purpose. Then you can add features gradually, starting with the ones you expect to be most essential. This process is called iterative modeling.
Comparing results of successive models provides a form of internal validation, so you can catch conceptual, mathematical, and software errors. And by adding and removing features, you can tell which ones have the biggest effect on the results, and which can be ignored.
Comparing results to data from the real world provides external validation, which is generally the strongest test.
The modeling framework is pretty abstract; the following example might make it clearer.
The Falling Penny Myth
You might have heard that a penny dropped from the top of the Empire State Building would be going so fast when it hit the pavement that it would be embedded in the concrete; or if it hit a person, it would break their skull.
We can test this myth by making and analyzing two models. For the first model, we'll assume that the effect of air resistance is small. In that case, the primary force acting on the penny is gravity, which causes the penny to accelerate downward.
If the initial velocity is 0 and the acceleration, , is constant, the velocity after seconds is
and the distance the penny has dropped is
To find the time until the penny reaches the sidewalk, we can solve for :
Plugging in the acceleration of gravity, m/s, and the height of the Empire State Building, m, we get s.
Then computing we get a velocity on impact of m/s, which is about 190 miles per hour. That sounds like it could hurt.
Of course, these results are not exact because the model is based on simplifications. For example, we assume that gravity is constant. In fact, the force of gravity is different on different parts of the globe, and it gets weaker as you move away from the surface. But these differences are small, so ignoring them is probably a good choice for this problem.
On the other hand, ignoring air resistance is not a good choice, because in this scenario its effect is substantial. Once the penny gets to about 29 m/s, the upward force of air resistance equals the downward force of gravity, so the penny stops accelerating. This is the terminal velocity of the penny in air.
And that suggests a second model, where the penny accelerates until it reaches terminal velocity; after that, acceleration is 0 and velocity is constant. In this model, the penny hits the sidewalk at about 29 m/s. That's much less than 86 m/s, which is what the first model predicts. Getting hit with a penny at that speed might hurt, but it would be unlikely to cause real harm. And it would not damage concrete.
The statistician George Box famously said "All models are wrong, but some are useful." He was talking about statistical models, but his wise words apply to all kinds of models. Our first model, which ignores air resistance, is very wrong, and probably not useful. The second model, which takes air resistance into account, is still wrong, but it's better, and it's good enough to refute the myth.
The television show Mythbusters has tested the myth of the falling penny more carefully; you can view the results at https://modsimpy.com/myth/. Their work is based on a mathematical model of motion, measurements to determine the force of air resistance on a penny, and a physical model of a human head.
Computation In Python
Let me show you how I computed the results from the previous section using Python. First we'll create a variable to represent acceleration due to gravity in meters per second squared (m/s).
A variable is a name that corresponds to a value. In this example, the name is a
and the value is the number 9.8
.
Suppose we let the penny drop for seconds (s). I'll create a variable to represent this time:
Now we can compute the velocity of the penny after t
seconds.
Python uses the symbol *
for multiplication. The other arithmetic operators are +
for addition, -
for subtraction, /
for division, and **
for exponentiation.
After you assign a value to a variable, you can display the value like this:
After s, the velocity of the penny is about m/s (ignoring air resistance). Now let's see how far it would travel during that time:
It would travel about m. Now, going in the other direction, let's compute the time it takes to fall 381 m, the height of the Empire State Building.
For this computation, we need the square root function, sqrt
, which is provided by a library called NumPy. Before we can use it, we have to import it like this:
Now we can use it like this:
With no air resistance, it would take about s for the penny to reach the sidewalk.
And its velocity on impact would be about m/s.
False Precision
Python displays results with about 16 digits, which gives the impression that they are very precise, but don't be fooled. The numbers we get out are only as good as the numbers we put in.
For example, the "roof height" of the Empire State Building is m (according to Wikipedia: https://en.wikipedia.org/wiki/Empire_State_Building). I chose m for this example on the assumption that the observation deck is on the roof and you drop the penny from a 1 meter railing. But that's probably not right, so we should treat this value as an approximation where only the first two digits are likely to be right.
If the precision of the inputs is two digits, the precision of the outputs is two digits, at most. That's why, if the output is 86.41527642726142
, I report it as "about 86".
Computation With Units
The computations we just did use numbers without units. When we set h=381
, we left out the meters, and when we set a=9.8
, we left out the meters per second squared. And, when we got the result v=86
, we added back the meters per second.
Leaving units out of computation is a common practice, but it tends to cause errors, including the very expensive failure of the Mars Climate Orbiter in 1999 (see https://en.wikipedia.org/wiki/Mars_Climate_Orbiter). When possible, it is better to include units in the computation.
To represent units, we'll use a library called Pint (https://pint.readthedocs.io/). To use it, we have to import a function called UnitRegistry
and call it like this:
The result is an object that contains variables representing pretty much every unit you've heard of. For example:
But leagues and fortnights are not part of the International System of Units (see https://en.wikipedia.org/wiki/International_System_of_Units); in the jargon, they are not "SI units". Instead, we will use meter
and second
.
To find out what other units are defined, type units.
(including the period) in the next cell.
If you are on Colab, a pop-up menu should appear with a list of units. In other Jupyter environments, you might have to press TAB
to get the menu.
We can use meter
and second
to create a variable named a
and give it the value of acceleration due to gravity.
The result is a quantity with two parts, called magnitude
and units
, which we can access like this:
We can also create a quantity that represents s.
And use it to compute the distance a penny would fall after t
seconds with constant acceleration a
.
Notice that the units of the result are correct. If we create a quantity to represent the height of the Empire State Building:
We can use it to compute the time the penny would take to reach the sidewalk.
And the velocity of the penny on impact:
As in the previous section, the result is about , but now it has the correct units, m/s.
With Pint quantities, we can convert from one set of units to another like this:
If you are more familiar with miles per hour, this result might be easier to interpret. And it might give you a sense that this model is not realistic.
Summary
This chapter introduces a modeling framework that consists of three steps:
Abstraction is the process of defining a model by deciding which elements of the real world to include and which can be left out.
Analysis and simulation are ways to use a model to generate predictions, explain why things behave as they do, and design things that behave as we want.
Validation is how we test whether the model is right, often by comparing predictions with measurements from the real world.
As a first example, we modeled a penny dropped from the Empire State building, including gravity but ignoring air resistance. In the exercises, you'll have a chance to try a better model, including air resistance.
This chapter also presents Pint, a library for doing computation with units, which is convenient for converting between different units and helpful for avoiding catastrophic errors.
Exercises
Before you go on, you might want to work on the following exercises.
Exercise 1
In mathematical notation, we can write an equation like and it's understood that we are multiplying and . But that doesn't work in Python. If you put two variables side-by-side, like this:
you'll get a syntax error, which means that something is wrong with the structure of the program. Try it out so you see what the error message looks like.
Exercise 2
In this chapter we used the sqrt
function from the NumPy library. NumPy also provides a variable named pi
that contains an approximation of the mathematical constant . We can import it like this:
NumPy provides other functions we'll use, including log
, exp
, sin
, and cos
. Import sin
and cos
from NumPy and compute
Note: A mathematical identity tells us that the answer should be .
Exercise 3
Suppose you bring a 10 foot pole to the top of the Empire State Building and use it to drop the penny from h
plus 10 feet.
Define a variable named foot
that contains the unit foot
provided by the UnitRegistry
we called units
. Define a variable named pole_height
and give it the value 10 feet.
What happens if you add h
, which is in units of meters, to pole_height
, which is in units of feet? What happens if you write the addition the other way around?
Exercise 4
Why would it be nonsensical to add a
and t
? What happens if you try?
In this example, you should get a DimensionalityError
, which indicates that you have violated a rule of dimensional analysis: you cannot add quantities with different dimensions.
The error messages you get from Python are big and scary, but if you read them carefully, they contain a lot of useful information.
The last line usually tells you what type of error happened, and sometimes additional information, so you might want to start from the bottom and read up.
The previous lines are a traceback of what was happening when the error occurred. The first section of the traceback shows the code you wrote. The following sections are often from Python libraries.
Before you go on, you might want to delete the erroneous code so the notebook can run without errors.
Exercise 5
Suppose instead of dropping the penny, you throw it downward at its terminal velocity, m/s. How long would it take to fall m?
Exercise 6:
So far we have considered two models of a falling penny:
If we ignore air resistance, the penny falls with constant acceleration, and we can compute the time to reach the sidewalk and the velocity of the penny when it gets there.
If we take air resistance into account, and drop the penny at its terminal velocity, it falls with constant velocity.
Now let's consider a third model that includes elements of the first two: let's assume that the acceleration of the penny is a
until the penny reaches m/s, and then m/s afterwards. What is the total time for the penny to fall m?
You can break this question into three parts:
How long would the penny take to reach m/s with constant acceleration
a
.How far would it fall during that time?
How long would it take to fall the remaining distance with constant velocity m/s?
Suggestion: Assign each intermediate result to a variable with a meaningful name. And assign units to all quantities!
Exercise 7
When I was in high school, the pitcher on the baseball team claimed that when he threw a fastball he was throwing the ball down; that is, the ball left his hand at a downward angle. I was skeptical; watching from the side, I thought the ball left his hand at an upward angle.
Can you think of a simple model you could use to settle the argument? What factors would you include and what could you ignore? What quantities would you have to look up or estimate?
I suggest you convert all quantities to SI units like meters and seconds (see https://en.wikipedia.org/wiki/International_System_of_Units).
Exercise 8
Suppose I run a 10K race in 44:52 (44 minutes and 52 seconds). What is my average pace in minutes per mile?
More Jupyter
You can add and remove cells from a notebook using the buttons in the toolbar and the items in the menu, both of which you should see at the top of this notebook.
Try the following exercises:
From the Insert menu select "Insert cell below" to add a cell below this one. By default, you get a code cell, as you can see in the pulldown menu that says "Code".
In the new cell, add a print statement like
print('Hello')
, and run it.Add another cell, select the new cell, and then click on the pulldown menu that says "Code" and select "Markdown". This makes the new cell a text cell.
In the new cell, type some text, and then run it.
Use the arrow buttons in the toolbar to move cells up and down.
Use the cut, copy, and paste buttons to delete, add, and move cells.
As you make changes, Jupyter saves your notebook automatically, but if you want to make sure, you can press the save button, which looks like a floppy disk from the 1990s.
Finally, when you are done with a notebook, select "Close and Halt" from the File menu.
When you change the contents of a cell, you have to run it again for those changes to have an effect. If you forget to do that, the results can be confusing, because the code you are looking at is not the code you ran.
If you ever lose track of which cells have run, and in what order, you should go to the Kernel menu and select "Restart & Run All". Restarting the kernel means that all of your variables get deleted, and running all the cells means all of your code will run again, in the right order.
Select "Restart & Run All" now and confirm that it runs all of the cells.