Interactive Python in the Jupyter Notebook

Nathan Lloyd

To be covered:

  • How to use Jupyter Notebook
  • Some really basic Python programming
  • Some basic data analysis and plotting with pandas and matplotlib

Not covered:

What is Jupyter?

  • Jupyter is a notebook environment with a browser interface that can be used for interactive computing in Python and other languages (I haven't tried other languages).
  • From the moment you open (or restart) a Jupyter notebook, variable assignments, function definitions, etc. are kept in memory and updated as you run cells.
  • Basic usage:
    • Type some code in a cell
    • Hit <shift><enter> to execute the cell
    • The value of the last line of the cell (shown as its string representation) will be output below the code.

Saving

  • The notebook periodically gets saved automatically, but you can also do File -> Save and Checkpoint
  • You can change the name by clicking on the name (I like to do "Make a copy" periodically to explicitly save older versions)
In [1]:
my_name = 'ari'
In [2]:
my_name
Out[2]:
'ari'
In [3]:
2 + 2
Out[3]:
4
In [4]:
example_variable = 2 + 2
In [5]:
example_variable
Out[5]:
4
In [6]:
example_variable = example_variable + 5
example_variable
Out[6]:
9
  • Other key Jupyter features:
    • tab completion (right now, if you type "exam" and hit <tab>, it will complete with "example_variable")
    • easy access to function documentation
    • syntax highlighting
    • %%time at the beginning of a cell will print out how long it took to run the code

Jupyter editing

  • Jupyter interactive editing is somewhat patterned after the unix editor vi (if you're a nerd and you happen to know what that is).
  • You're either in edit mode or command mode. You can tell you're in edit mode if there is a blinking cursor in the highlighted cell.
  • In edit mode, obviously, you edit the code.
  • In command mode, there are a bunch of shortcut keys. For example, hitting 'm' changes cell type to "Markdown". Hitting 'l' adds line numbers. 'x' cuts (deletes) the cell. 'z' undoes the last cell deletion, but currently it only remembers one deletion back.
  • Common mistake: starting to type while in command mode.
  • More shortcut keys in command mode:
    • <shift><enter> to run the code in the cell
    • 'a' and 'b' create a new cell above or below the current cell
    • 'c', 'x', 'v': copy, cut, paste.
    • 'M' merges the current cell with the one below it.
    • 'r' changes cell type to "Raw", which is useful as a quick way to clear lengthy output: hit 'r' then 'y' to change it back to type "Code"
  • Shortcut keys in edit mode:
    • <shift><enter> to run the code in the cell
    • <command>z to undo edits
    • You can highlight a bunch of code and hit <tab> to indent it
    • To split cells, click inside the cell where you want the split and hit <ctrl><shift>-
    • Hit <escape> while in edit mode to switch to command mode.

You can also change cell type using the "Cell" pull-down menu.
By the way, "Markdown" is what I'm using to make these nice text cells (you can even have tables!) - see Markdown Cheatsheet.
"Raw" is a good way to comment out a cell so it won't run.

Coding examples

In [7]:
# Execute me by hitting <shift><enter>
for i in range(7):
    print(i)
0
1
2
3
4
5
6

A couple of really important points:

  • Click inside the parentheses of the range function and hit <shift><tab> to see documentation.
  • range(7) spits out 0 to 6: it starts at 0 and stops before 7, which matches Python's 0-based indexing. More on that later.
  • Stuff inside the for loop must be indented consistently (4 spaces is the convention). Python is very much a stickler on this!
    • (Explanation: the Python interpreter considers the code inside the loop to begin after the ":" and end at the first line that is no longer indented)
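
To make the two points above concrete, here is a small sketch (the variable names values and total are just made up for illustration). It shows what range(7) contains and how indentation decides which lines belong to the loop.

```python
# range(7) produces seven integers, starting at 0 and stopping before 7.
values = list(range(7))
print(values)

# Indentation decides what belongs to the loop body.
total = 0
for i in range(7):
    total = total + i    # indented: runs on every iteration
print(total)             # not indented: runs once, after the loop ends
```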

Exercise: Predict in your head what the following code will print out, then try executing it.

In [8]:
for i in range(2):
    print(i)
    print("yo")
print("dude")
0
yo
1
yo
dude

Exercise: below, add an argument to the print function to make it print out "0123456" without the newlines.

Hint: use <shift><tab> trick to see documentation for print function

In [12]:
for i in range(7):
    print(i)
0
1
2
3
4
5
6
In [17]:
s = ""
for i in range(7):
    s=s+str(i)
    print(s)
0
01
012
0123
01234
012345
0123456
In [20]:
for i in range(7):
    print(i,end='')
0123456
In [ ]:
 

Execute cell below to get solution

In [ ]:
%load exercise_print_without_newline.py
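
For reference, here is a sketch of two ways to get the digits onto one line. This is not necessarily what the solution file contains; the variable name digits is an assumption for illustration.

```python
# Option 1: tell print() not to end each call with a newline.
for i in range(7):
    print(i, end='')
print()  # finish the line afterwards

# Option 2: build the whole string first, then print it once.
digits = ''.join(str(i) for i in range(7))
print(digits)
```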

Lists, tuples, dictionaries: containers of arbitrary things

  • These are the most important data storage constructs in basic python

List

In [22]:
example_list = ['alpha', 'bravo', 2, print]
In [23]:
type(example_list)
Out[23]:
list
In [24]:
# Print each item in the list

for item in example_list:
    print(item,end="")
alphabravo2<built-in function print>
In [25]:
# Access the second item in the list (0-based indexing!)

example_list[1]
Out[25]:
'bravo'

Tuple

In [26]:
example_tuple = ('alpha', 'bravo', 2, print)
In [27]:
type(example_tuple)
Out[27]:
tuple
In [28]:
for item in example_tuple:
    print(item)
alpha
bravo
2
<built-in function print>
In [29]:
example_tuple[0]
Out[29]:
'alpha'

A tuple is just like a list except it's immutable: you can't change its contents after you create it.
Use lists unless you know you need a tuple.
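
A quick sketch of the difference, using shortened versions of the containers above. The try: except: construct is covered further down in the Exceptions section; here it just keeps the cell from stopping on the error. The name tuple_error is made up for illustration.

```python
example_list = ['alpha', 'bravo', 2]
example_tuple = ('alpha', 'bravo', 2)

example_list[0] = 'zulu'         # fine: lists can be changed in place
example_list.append('charlie')   # fine: lists can grow

try:
    example_tuple[0] = 'zulu'    # tuples cannot be changed after creation
except TypeError as err:
    tuple_error = str(err)
    print(tuple_error)
```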

Dict

In [30]:
example_dict = {'a':63, 7:'a', 'g':print}   # {key:value, key:value, key:value}
In [31]:
type(example_dict)
Out[31]:
dict
In [32]:
example_dict
Out[32]:
{'g': <function print>, 'a': 63, 7: 'a'}
In [33]:
example_dict['a']
Out[33]:
63
In [34]:
example_dict[7]
Out[34]:
'a'
In [35]:
# Raises Exception KeyError if you query a key that is not in the dict
# You can do example_dict.get(key, default_value) to gracefully return, e.g. 0 when the key is missing.

example_dict[15]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-35-1a683601de9e> in <module>()
      2 # You can do example_dict.get(key, default_value) to gracefully return, e.g. 0 when the key is missing.
      3 
----> 4 example_dict[15]

KeyError: 15
In [36]:
example_dict['g']
Out[36]:
<function print>

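Common mistake: example_dict[g] would throw an exception (a NameError, unless a variable named g happens to exist), because without quotes Python reads g as a variable name. You need those quotes.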

In [37]:
# example_dict['g'] is a function, so we can use it like a function

example_dict['g']("you coding, bro?")
you coding, bro?
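
Going back to the missing-key problem: here is a minimal sketch of the .get() approach mentioned in the comment above, plus the "in" check, which is another common way to avoid a KeyError. The names present, missing and has_15 are made up for illustration.

```python
example_dict = {'a': 63, 7: 'a', 'g': print}

# .get() returns the value if the key exists, otherwise the default you pass.
present = example_dict.get('a', 0)
missing = example_dict.get(15, 0)    # no KeyError, just the default

# "in" checks whether a key is present before you index.
has_15 = 15 in example_dict

print(present, missing, has_15)
```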

Strings

In [ ]:
'  A String  '
In [ ]:
type('  A String  ')
In [42]:
# The str type has lots of useful functions

'  A String '.strip().lower()
Out[42]:
'a string'

Exercise

See what each of those functions, .strip() and .lower(), does separately.

In [44]:
'   A    String  k   '.strip()
Out[44]:
'A    String  k'
In [41]:
'A String'.lower()
Out[41]:
'a string'
In [48]:
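dictionary = {'g': print}
# print() displays its argument but returns None, so k is the string 'None'
k = str(dictionary['g']('whAt up'))
print(k.lower())   # hence the 'none' in the output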
whAt up
none
In [ ]:
'Jack' == 'jack'   # comparison of strings
In [ ]:
'Jack'.lower() == 'jack'.lower()   # comparison of strings
In [ ]:
'time' + 'honey'  # can "add" strings.
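
The three cells above were left unexecuted, so here is a sketch of what they evaluate to, with the results stored in (made-up) variable names so they can be printed together.

```python
same_exact = 'Jack' == 'jack'                           # comparison is case-sensitive
same_ignoring_case = 'Jack'.lower() == 'jack'.lower()   # lowercase both sides first
joined = 'time' + 'honey'                               # "+" glues strings together

print(same_exact, same_ignoring_case, joined)
```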

Exceptions

In [49]:
a = 'FANTASTIC!'
a.lower()
Out[49]:
'fantastic!'
In [50]:
a = 3
a.lower()   # throws Exception
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-50-56f7a079a502> in <module>()
      1 a = 3
----> 2 a.lower()   # throws Exception

AttributeError: 'int' object has no attribute 'lower'

AttributeError is the exception Python throws when you ask an object for an attribute or method it doesn't have; here, an int has no .lower().
Earlier we saw a dict throw the exception KeyError when I queried a key that wasn't present.

Often an exception gets thrown, like, 4 layers deep in function calls, e.g. e(f(g(h(whathaveyou)))), and the function h has a problem operating on whathaveyou. In that case you get a traceback that shows the layers, which is long, but you get used to it.

Often the error message is not helpful. Keep calm and google it.
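
Here is a rough sketch of what "layers" means in a traceback. The functions e, f, g and h are hypothetical, and I'm assuming each one calls the next (e calls f, f calls g, g calls h), which is the situation where the traceback gets deep. The standard-library traceback module is used to list the function names instead of printing the full traceback.

```python
import traceback

def h(x):
    return x.lower()      # fails if x is not a string

def g(x):
    return h(x)

def f(x):
    return g(x)

def e(x):
    return f(x)

try:
    e(3)
except AttributeError as err:
    # Walk the traceback and collect the function names, outermost first.
    frames = traceback.extract_tb(err.__traceback__)
    layers = [frame.name for frame in frames]
    print(layers)
    print(err)
```

The last name in the list is where the problem actually happened, so when reading a long traceback it usually pays to start from the bottom.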

In [51]:
# Remember that crazy list of mixed types from earlier?
example_list = ['alpha', 'bravo', 2, print]
In [52]:
for item in example_list:
    item('works!')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-52-f50cdea796b9> in <module>()
      1 for item in example_list:
----> 2     item('works!')

TypeError: 'str' object is not callable

Here, TypeError is thrown on the first iteration of the for loop because the first item, 'alpha', is not a function, and you can only do something(input) if something is a function (well, there are other things it could be, e.g. a class, but you get the point).
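
If you'd rather check up front than run into the exception, the built-in callable() tells you whether something can be used like something(input). A minimal sketch (callable_flags is a made-up name):

```python
example_list = ['alpha', 'bravo', 2, print]

for item in example_list:
    if callable(item):
        item('works!')                         # only print gets here
    else:
        print(repr(item), 'is not callable')

# The same check as a list of True/False values, one per item.
callable_flags = [callable(item) for item in example_list]
```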

In [ ]:
# try: except: clause to handle exceptions
for item in example_list:
    try:
        item('carl is my therapist')
    except Exception:   # a bare "except:" would also swallow a kernel interrupt
        print("Error: {} doesn't work".format(item))
In [ ]:
# Can also get more information on what's causing the exception
for item in example_list:
    try:
        item('carl is my therapist')
    except Exception:   # a bare "except:" would also swallow a kernel interrupt
        print("Error: {} doesn't work".format(item))
        print(type(item))
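
One more variation, sketched under the assumption that TypeError is the only error we expect here: naming the exception type and adding "as err" gives you the exception object itself, so you can print Python's own message. The list messages is made up for illustration.

```python
example_list = ['alpha', 'bravo', 2, print]
messages = []

for item in example_list:
    try:
        item('carl is my therapist')
    except TypeError as err:
        # err is the exception object; str(err) is its message
        messages.append(str(err))
        print("Error: {!r} doesn't work ({})".format(item, err))
```

Catching only TypeError means any other, unexpected kind of error still stops the cell, which is usually what you want while debugging.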

Exercise:

  1. Store a string in a variable, example: color = 'blue'
  2. Make a list containing the functions type, len, and str.upper
  3. Use a for loop to iterate over that list like so: for func in your_list:
  4. In the for loop, execute the current function with the variable you created as input, and print its output: print(func(color))
In [53]:
color = 'blue'
In [54]:
listed = [type, len, str.upper]
In [57]:
for items in listed:
    print(items(color))
<class 'str'>
4
BLUE

Execute cell below to get solution

In [ ]:
%load exercise_list_functions.py

Followup exercise:

  1. Add the string 'nothing' to the list, so that we get an exception thrown when python tries to execute it.
  2. Now put the function call inside a try: except: clause to gracefully handle the exception.
In [61]:
listed.append('nothing')
print(listed)
[<class 'type'>, <built-in function len>, <method 'upper' of 'str' objects>, 'nothing', 'nothing', 'nothing', 'nothing']
In [73]:
for func in listed:
    try:
        print(func(color))
    except:
        print('oh shit'.upper(),sep='',end='.')
<class 'str'>
4
BLUE
OH SHIT.OH SHIT.OH SHIT.OH SHIT.OH SHIT.
In [ ]:
 

Execute cell below to get solution

In [ ]:
%load exercise_list_functions_try_except.py

Working with data

Example dataset: housing values in Boston:
http://archive.ics.uci.edu/ml/datasets/Housing

In [74]:
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('ggplot')
%matplotlib inline
In [75]:
df = pd.read_csv('housing.txt', sep='\t')
In [76]:
df
Out[76]:
crime zone industrial river nox rooms age distance highway tax pt_ratio bp lstat median_value
0 0.00632 18.0 2.31 0 0.538 6.575 65.2 4.0900 1 296 15.3 396.90 4.98 24.0
1 0.02731 0.0 7.07 0 0.469 6.421 78.9 4.9671 2 242 17.8 396.90 9.14 21.6
2 0.02729 0.0 7.07 0 0.469 7.185 61.1 4.9671 2 242 17.8 392.83 4.03 34.7
3 0.03237 0.0 2.18 0 0.458 6.998 45.8 6.0622 3 222 18.7 394.63 2.94 33.4
4 0.06905 0.0 2.18 0 0.458 7.147 54.2 6.0622 3 222 18.7 396.90 5.33 36.2
5 0.02985 0.0 2.18 0 0.458 6.430 58.7 6.0622 3 222 18.7 394.12 5.21 28.7
6 0.08829 12.5 7.87 0 0.524 6.012 66.6 5.5605 5 311 15.2 395.60 12.43 22.9
7 0.14455 12.5 7.87 0 0.524 6.172 96.1 5.9505 5 311 15.2 396.90 19.15 27.1
8 0.21124 12.5 7.87 0 0.524 5.631 100.0 6.0821 5 311 15.2 386.63 29.93 16.5
9 0.17004 12.5 7.87 0 0.524 6.004 85.9 6.5921 5 311 15.2 386.71 17.10 18.9
10 0.22489 12.5 7.87 0 0.524 6.377 94.3 6.3467 5 311 15.2 392.52 20.45 15.0
11 0.11747 12.5 7.87 0 0.524 6.009 82.9 6.2267 5 311 15.2 396.90 13.27 18.9
12 0.09378 12.5 7.87 0 0.524 5.889 39.0 5.4509 5 311 15.2 390.50 15.71 21.7
13 0.62976 0.0 8.14 0 0.538 5.949 61.8 4.7075 4 307 21.0 396.90 8.26 20.4
14 0.63796 0.0 8.14 0 0.538 6.096 84.5 4.4619 4 307 21.0 380.02 10.26 18.2
15 0.62739 0.0 8.14 0 0.538 5.834 56.5 4.4986 4 307 21.0 395.62 8.47 19.9
16 1.05393 0.0 8.14 0 0.538 5.935 29.3 4.4986 4 307 21.0 386.85 6.58 23.1
17 0.78420 0.0 8.14 0 0.538 5.990 81.7 4.2579 4 307 21.0 386.75 14.67 17.5
18 0.80271 0.0 8.14 0 0.538 5.456 36.6 3.7965 4 307 21.0 288.99 11.69 20.2
19 0.72580 0.0 8.14 0 0.538 5.727 69.5 3.7965 4 307 21.0 390.95 11.28 18.2
20 1.25179 0.0 8.14 0 0.538 5.570 98.1 3.7979 4 307 21.0 376.57 21.02 13.6
21 0.85204 0.0 8.14 0 0.538 5.965 89.2 4.0123 4 307 21.0 392.53 13.83 19.6
22 1.23247 0.0 8.14 0 0.538 6.142 91.7 3.9769 4 307 21.0 396.90 18.72 15.2
23 0.98843 0.0 8.14 0 0.538 5.813 100.0 4.0952 4 307 21.0 394.54 19.88 14.5
24 0.75026 0.0 8.14 0 0.538 5.924 94.1 4.3996 4 307 21.0 394.33 16.30 15.6
25 0.84054 0.0 8.14 0 0.538 5.599 85.7 4.4546 4 307 21.0 303.42 16.51 13.9
26 0.67191 0.0 8.14 0 0.538 5.813 90.3 4.6820 4 307 21.0 376.88 14.81 16.6
27 0.95577 0.0 8.14 0 0.538 6.047 88.8 4.4534 4 307 21.0 306.38 17.28 14.8
28 0.77299 0.0 8.14 0 0.538 6.495 94.4 4.4547 4 307 21.0 387.94 12.80 18.4
29 1.00245 0.0 8.14 0 0.538 6.674 87.3 4.2390 4 307 21.0 380.23 11.98 21.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
476 4.87141 0.0 18.10 0 0.614 6.484 93.6 2.3053 24 666 20.2 396.21 18.68 16.7
477 15.02340 0.0 18.10 0 0.614 5.304 97.3 2.1007 24 666 20.2 349.48 24.91 12.0
478 10.23300 0.0 18.10 0 0.614 6.185 96.7 2.1705 24 666 20.2 379.70 18.03 14.6
479 14.33370 0.0 18.10 0 0.614 6.229 88.0 1.9512 24 666 20.2 383.32 13.11 21.4
480 5.82401 0.0 18.10 0 0.532 6.242 64.7 3.4242 24 666 20.2 396.90 10.74 23.0
481 5.70818 0.0 18.10 0 0.532 6.750 74.9 3.3317 24 666 20.2 393.07 7.74 23.7
482 5.73116 0.0 18.10 0 0.532 7.061 77.0 3.4106 24 666 20.2 395.28 7.01 25.0
483 2.81838 0.0 18.10 0 0.532 5.762 40.3 4.0983 24 666 20.2 392.92 10.42 21.8
484 2.37857 0.0 18.10 0 0.583 5.871 41.9 3.7240 24 666 20.2 370.73 13.34 20.6
485 3.67367 0.0 18.10 0 0.583 6.312 51.9 3.9917 24 666 20.2 388.62 10.58 21.2
486 5.69175 0.0 18.10 0 0.583 6.114 79.8 3.5459 24 666 20.2 392.68 14.98 19.1
487 4.83567 0.0 18.10 0 0.583 5.905 53.2 3.1523 24 666 20.2 388.22 11.45 20.6
488 0.15086 0.0 27.74 0 0.609 5.454 92.7 1.8209 4 711 20.1 395.09 18.06 15.2
489 0.18337 0.0 27.74 0 0.609 5.414 98.3 1.7554 4 711 20.1 344.05 23.97 7.0
490 0.20746 0.0 27.74 0 0.609 5.093 98.0 1.8226 4 711 20.1 318.43 29.68 8.1
491 0.10574 0.0 27.74 0 0.609 5.983 98.8 1.8681 4 711 20.1 390.11 18.07 13.6
492 0.11132 0.0 27.74 0 0.609 5.983 83.5 2.1099 4 711 20.1 396.90 13.35 20.1
493 0.17331 0.0 9.69 0 0.585 5.707 54.0 2.3817 6 391 19.2 396.90 12.01 21.8
494 0.27957 0.0 9.69 0 0.585 5.926 42.6 2.3817 6 391 19.2 396.90 13.59 24.5
495 0.17899 0.0 9.69 0 0.585 5.670 28.8 2.7986 6 391 19.2 393.29 17.60 23.1
496 0.28960 0.0 9.69 0 0.585 5.390 72.9 2.7986 6 391 19.2 396.90 21.14 19.7
497 0.26838 0.0 9.69 0 0.585 5.794 70.6 2.8927 6 391 19.2 396.90 14.10 18.3
498 0.23912 0.0 9.69 0 0.585 6.019 65.3 2.4091 6 391 19.2 396.90 12.92 21.2
499 0.17783 0.0 9.69 0 0.585 5.569 73.5 2.3999 6 391 19.2 395.77 15.10 17.5
500 0.22438 0.0 9.69 0 0.585 6.027 79.7 2.4982 6 391 19.2 396.90 14.33 16.8
501 0.06263 0.0 11.93 0 0.573 6.593 69.1 2.4786 1 273 21.0 391.99 9.67 22.4
502 0.04527 0.0 11.93 0 0.573 6.120 76.7 2.2875 1 273 21.0 396.90 9.08 20.6
503 0.06076 0.0 11.93 0 0.573 6.976 91.0 2.1675 1 273 21.0 396.90 5.64 23.9
504 0.10959 0.0 11.93 0 0.573 6.794 89.3 2.3889 1 273 21.0 393.45 6.48 22.0
505 0.04741 0.0 11.93 0 0.573 6.030 80.8 2.5050 1 273 21.0 396.90 7.88 11.9

506 rows × 14 columns
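
Printing all 506 rows is rarely what you want. Here is a small self-contained sketch of some gentler ways to look at a DataFrame. It uses three rows and three columns copied from the table above, fed through io.StringIO so it runs without housing.txt; the names sample_text and sample_df are made up for illustration.

```python
import io
import pandas as pd

# Three rows copied from the table above, tab-separated like housing.txt.
sample_text = (
    "crime\trooms\tmedian_value\n"
    "0.00632\t6.575\t24.0\n"
    "0.02731\t6.421\t21.6\n"
    "0.02729\t7.185\t34.7\n"
)
sample_df = pd.read_csv(io.StringIO(sample_text), sep='\t')

print(sample_df.head(2))                  # first two rows only
print(sample_df.shape)                    # (number of rows, number of columns)
print(sample_df['median_value'].max())    # largest value in one column
```

On the real data, df.head() and df.shape work the same way.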

In [77]:
fig, ax = plt.subplots(1, 1)
ax.scatter(x=df['crime'], y=df['median_value'])
ax.set_xlabel('crime per capita')
ax.set_ylabel('median value / $1000');
In [78]:
fig, ax = plt.subplots(1, 1)
ax.scatter(x=df['nox'], y=df['median_value'])
ax.set_xlabel('Nitric oxides concentration (parts per 10M)')
ax.set_ylabel('median value / $1000')
Out[78]:
<matplotlib.text.Text at 0x7f1aff145630>
In [79]:
import numpy as np
In [80]:
fig, ax = plt.subplots(1, 1)
ax.scatter(x=df['nox'], y=df['median_value'], s=3*df['rooms']**2, c=np.sqrt(df['crime']), alpha=0.25, cmap='viridis')
ax.set_xlabel('Nitric oxides concentration (parts per 10M)')
ax.set_ylabel('median value / $1000')
Out[80]:
<matplotlib.text.Text at 0x7f1aff19e8d0>
In [ ]:
df['median_value'].hist()