GitHub Repository: dsc-courses/dsc10-2022-fa
Path: blob/main/lectures/lec11/lec11.ipynb

Kernel: Python 3 (ipykernel)

In [ ]:

# Set up packages for lecture. Don't worry about understanding this code, but
# make sure to run it if you're following along.
import numpy as np
import babypandas as bpd
import pandas as pd
from matplotlib_inline.backend_inline import set_matplotlib_formats
import matplotlib.pyplot as plt
%reload_ext pandas_tutor
%set_pandas_tutor_options {'projectorMode': True}
set_matplotlib_formats("svg")
plt.style.use('fivethirtyeight')

np.set_printoptions(threshold=20, precision=2, suppress=True)
pd.set_option("display.max_rows", 7)
pd.set_option("display.max_columns", 8)
pd.set_option("display.precision", 2)

Lecture 11 – Conditional Statements and Iteration

DSC 10, Fall 2022

Announcements

Homework 3 is due tomorrow at 11:59PM.
Lab 4 is due on Saturday, 10/22 at 11:59PM.
The Midterm Project will be released Wednesday!
- Partners are not required, but strongly encouraged.
- Your partner doesn't have to be from your lecture section.
- Before or after discussion today, we'll host a mixer to help you find a partner! See this post on EdStem for details.
- You must use the pair programming model when working with a partner.
If you have a conflict with your assigned discussion, email TA Dasha ([email protected]) to request to attend another.
Look at the Grade Report on Gradescope to see your scores on all assignments, discussion attendance, and number of used slip days so far.

Agenda

Booleans.
Conditional statements (i.e. if-statements).
Iteration (i.e. for-loops).

Note:

We've finished introducing new DataFrame manipulation techniques.
Today we'll cover some foundational programming tools, which will be very relevant as we start to cover more ideas in statistics (next week).

Booleans

Recap: Booleans

bool is a data type in Python, just like int, float, and str.
- It stands for "Boolean", named after George Boole, a 19th-century mathematician.
There are only two possible Boolean values: True or False.
- Yes or no.
- On or off.
- 1 or 0.
Comparisons result in Boolean values.

In [ ]:

capstone = 'finished'
units = 123

In [ ]:

units >= 180

In [ ]:

type(units >= 180)
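The "1 or 0" reading of Booleans is literal in Python. The short sketch below, which uses only built-ins, shows how True and False behave in arithmetic. The name `comparisons` is made up for illustration.

```python
units = 123

# True behaves like 1 and False behaves like 0 in arithmetic.
print(True + True)    # 2
print(False * 10)     # 0

# Summing a list of comparisons counts how many of them are True.
comparisons = [units >= 120, units >= 180, units >= 100]
num_true = sum(comparisons)
print(num_true)       # 2
```

This is why adding up Booleans is a common way to count how often a condition holds.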

Boolean operators; `not`

There are three operators that allow us to perform arithmetic with Booleans – not, and, and or.

not flips True ↔️ False.

In [ ]:

capstone

In [ ]:

capstone == 'finished'

In [ ]:

not capstone == 'finished'

The `and` operator

The and operator is placed between two bools. It is True if both are True; otherwise, it's False.

In [ ]:

capstone

In [ ]:

units

In [ ]:

capstone == 'finished' and units >= 180

In [ ]:

capstone == 'finished' and units >= 120

The `or` operator

The or operator is placed between two bools. It is True if at least one is True; otherwise, it's False.

In [ ]:

capstone

In [ ]:

units

In [ ]:

capstone == 'finished' or units >= 180

In [ ]:

# Both are True!
capstone == 'finished' or units >= 0

In [ ]:

# Both are False!
capstone == 'not started' or units >= 180

Order of operations

By default, the order of operations is not, and, or. See the precedence of all operators in Python here.
As usual, use (parentheses) to make expressions more clear.

In [ ]:

capstone

In [ ]:

units

In [ ]:

capstone == 'finished' or (capstone == 'in progress' and units >= 180)

In [ ]:

# Different meaning!
(capstone == 'finished' or capstone == 'in progress') and units >= 180

In [ ]:

# "and" has precedence.
capstone == 'finished' or capstone == 'in progress' and units >= 180

Booleans can be tricky!

For instance, not (a and b) is different from not a and not b! If you're curious, read more about De Morgan's Laws.

In [ ]:

capstone

In [ ]:

units

In [ ]:

not (capstone == 'finished' and units >= 180)

In [ ]:

(not capstone == 'finished') and (not units >= 180)
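De Morgan's Laws say that not (a and b) matches (not a) or (not b), and that not (a or b) matches (not a) and (not b). The sketch below checks both laws for every combination of two Booleans, then reuses the capstone and units values from above. The names `a`, `b`, `left`, `right`, and `law` are chosen only for illustration.

```python
# Check De Morgan's Laws for every combination of two Booleans.
for a in [True, False]:
    for b in [True, False]:
        assert (not (a and b)) == ((not a) or (not b))
        assert (not (a or b)) == ((not a) and (not b))

# Reuse the values from the cells above.
capstone = 'finished'
units = 123
a = capstone == 'finished'   # True
b = units >= 180             # False

left = not (a and b)          # True
right = (not a) and (not b)   # False: not the same as left
law = (not a) or (not b)      # True: the correct rewrite of left
print(left, right, law)
```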

Note: `&` and `|` vs. `and` and `or`

Use the & and | operators between two Series. Arithmetic will be done element-wise (separately for each row).
- This is relevant when writing DataFrame queries, e.g. df[(df.get('capstone') == 'finished') & (df.get('units') >= 180)].

Use the and and or operators between two individual Booleans.
- e.g. capstone == 'finished' and units >= 180.
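To see the element-wise behavior concretely, here is a minimal sketch. It uses NumPy arrays as a stand-in for Series (an assumption, so that it runs without a DataFrame), and the three students' values are invented.

```python
import numpy as np

# Hypothetical data for three students.
capstones = np.array(['finished', 'in progress', 'finished'])
units_taken = np.array([185, 190, 123])

# & is applied element-wise. The parentheses around each comparison are needed,
# because & has higher precedence than == and >=.
ready = (capstones == 'finished') & (units_taken >= 180)
print(ready)     # True, False, False

# | is True wherever at least one of the two comparisons is True.
either = (capstones == 'finished') | (units_taken >= 180)
print(either)    # True, True, True
```

Replacing & with and in the lines above raises a ValueError, since Python cannot reduce an array of several Booleans to a single True or False.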

Concept Check ✅ – Answer at cc.dsc10.com

Suppose we define a = True and b = True. What does the following expression evaluate to?

not (((not a) and b) or ((not b) or a))

A. True

B. False

C. Could be either one

Aside: the `in` operator

Sometimes, we'll want to check if a particular element is in a list/array, or a particular substring is in a string. The in operator can do this for us:

In [ ]:

3 in [1, 2, 3]

In [ ]:

'hey' in 'hey my name is'

In [ ]:

'dog' in 'hey my name is'

Conditionals

`if`-statements

Often, we'll want to run a block of code only if a particular conditional expression is True.
The syntax for this is as follows (don't forget the colon!):

if <condition>:
    <body>

Indentation matters!

In [ ]:

capstone = 'finished'
capstone

In [ ]:

if capstone == 'finished':
    print('Looks like you are ready to graduate!')

`else`

else: Do something else if the specified condition is False.

In [ ]:

capstone = 'finished'
capstone

In [ ]:

if capstone == 'finished':
    print('Looks like you are ready to graduate!')
else:
    print('Before you graduate, you need to finish your capstone project.')

`elif`

What if we want to check more than one condition? Use elif.
elif: if the specified condition is False, check the next condition.
If that condition is False, check the next condition, and so on, until we see a True condition.
- After seeing a True condition, it evaluates the indented code and stops.
If none of the conditions are True, the else body is run.

In [ ]:

capstone = 'in progress'
units = 123

In [ ]:

if capstone == 'finished' and units >= 180:
    print('Looks like you are ready to graduate!')
elif capstone != 'finished' and units < 180:
    print('Before you graduate, you need to finish your capstone project and take', 180 - units, 'more units.')
elif units >= 180:
    print('Before you graduate, you need to finish your capstone project.')
else:
    print('Before you graduate, you need to take', 180 - units, 'more units.')

What if we use if instead of elif?

In [ ]:

if capstone == 'finished' and units >= 180:
    print('Looks like you are ready to graduate!')
if capstone != 'finished' and units < 180:
    print('Before you graduate, you need to finish your capstone project and take', 180 - units, 'more units.')
if units >= 180:
    print('Before you graduate, you need to finish your capstone project.')
else:
    print('Before you graduate, you need to take', 180 - units, 'more units.')
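The difference can be traced by recording which branches run. In the sketch below, the short labels and the list names `chain_messages` and `separate_messages` are stand-ins for the printed messages, chosen for illustration. Python's built-in list method append is used to collect them.

```python
capstone = 'in progress'
units = 123

# Version 1: one if/elif/else chain. At most one branch runs.
chain_messages = []
if capstone == 'finished' and units >= 180:
    chain_messages.append('ready')
elif capstone != 'finished' and units < 180:
    chain_messages.append('capstone and units')
elif units >= 180:
    chain_messages.append('capstone only')
else:
    chain_messages.append('units only')

# Version 2: separate if-statements. Every condition is checked on its own,
# and the else is attached only to the last if.
separate_messages = []
if capstone == 'finished' and units >= 180:
    separate_messages.append('ready')
if capstone != 'finished' and units < 180:
    separate_messages.append('capstone and units')
if units >= 180:
    separate_messages.append('capstone only')
else:
    separate_messages.append('units only')

print(chain_messages)      # ['capstone and units']
print(separate_messages)   # ['capstone and units', 'units only']
```

With separate if-statements, two messages appear: the second condition is True, and then the final if/else pair runs independently and falls into its else.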

Example: Percentage to letter grade

Below, complete the implementation of the function, grade_converter, which takes in a percentage grade (grade) and returns the corresponding letter grade, according to this table:

Letter	Range
A	[90, 100]
B	[80, 90)
C	[70, 80)
D	[60, 70)
F	[0, 60)

Your function should work on these examples:

>>> grade_converter(84)
'B'

>>> grade_converter(60)
'D'

In [ ]:

def grade_converter(grade):
    if grade >= 90:
        return 'A'
    elif grade >= 80:
        return 'B'
    elif grade >= 70:
        return 'C'
    elif grade >= 60:
        return 'D'
    else:
        return 'F'

In [ ]:

grade_converter(84)

In [ ]:

grade_converter(60)
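The order of the conditions matters, because Python stops at the first condition that is True. As a sketch, the hypothetical function `grade_converter_wrong_order` below checks the loosest condition first, so none of the later branches can ever run.

```python
def grade_converter_wrong_order(grade):
    # Hypothetical broken variant: any grade of 60 or more stops here.
    if grade >= 60:
        return 'D'
    elif grade >= 70:
        return 'C'
    elif grade >= 80:
        return 'B'
    elif grade >= 90:
        return 'A'
    else:
        return 'F'

print(grade_converter_wrong_order(95))   # 'D', not 'A'
print(grade_converter_wrong_order(50))   # 'F'
```

Checking from the highest cutoff down, as grade_converter does, means each elif only needs a lower bound.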

Activity


def mystery(a, b):
    if (a + b > 4) and (b > 0):
        return 'bear'
    elif (a * b >= 4) or (b < 0):
        return 'triton'
    else:
        return 'bruin'

Without running code:

What does mystery(2, 2) return?
Find inputs so that calling mystery will produce 'bruin'.

In [ ]:

def mystery(a, b):
    if (a + b > 4) and (b > 0):
        return 'bear'
    elif (a * b >= 4) or (b < 0):
        return 'triton'
    else:
        return 'bruin'

In [ ]:

Iteration

`for`-loops

In [ ]:

import time

print('Launching in...')

for x in [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]:
    print('t-minus', x)
    time.sleep(0.5) # Pauses for half a second
    
print('Blast off! 🚀')

`for`-loops

Loops allow us to repeat the execution of code. There are two types of loops in Python; the for-loop is one of them.
The syntax of a for-loop is as follows:

for <element> in <sequence>:
    <for body>

Read this as: "for each element of this sequence, repeat this code."
- Note: lists, arrays, and strings are all examples of sequences.
Like with if-statements, indentation matters!

Example: Squares

In [ ]:

num = 4
print(num, 'squared is', num ** 2)

num = 2
print(num, 'squared is', num ** 2)

num = 1
print(num, 'squared is', num ** 2)

num = 3
print(num, 'squared is', num ** 2)

In [ ]:

# The loop variable can be anything!

list_of_numbers = [4, 2, 1, 3]

for num in list_of_numbers:
    print(num, 'squared is', num ** 2)

The line print(num, 'squared is', num ** 2) is run four times:

On the first iteration, num is 4.
On the second iteration, num is 2.
On the third iteration, num is 1.
On the fourth iteration, num is 3.

This happens, even though there is no num = anywhere.

Activity

Using the array colleges, write a for-loop that prints:

Revelle College
John Muir College
Thurgood Marshall College
Earl Warren College
Eleanor Roosevelt College
Sixth College
Seventh College

Solution (try it yourself first):


for college in colleges:
    print(college + ' College')

In [ ]:

colleges = np.array(['Revelle', 'John Muir', 'Thurgood Marshall', 
            'Earl Warren', 'Eleanor Roosevelt', 'Sixth', 'Seventh'])

In [ ]:

...

Ranges

Recall, each element of a list/array has a numerical position.
- The position of the first element is 0, the position of the second element is 1, etc.
We can write a for-loop that accesses each element in an array by using its position.
np.arange will come in handy.

In [ ]:

actions = np.array(['ate', 'slept', 'exercised'])
feelings = np.array(['content 🙂', 'energized 😃', 'exhausted 😩'])

In [ ]:

len(actions)

In [ ]:

for i in np.arange(len(actions)):
    print(i)

In [ ]:

for i in np.arange(len(actions)):
    print('I', actions[i], 'and I felt', feelings[i])

Example: Goldilocks and the Three Bears

We don't have to use the loop variable!

In [ ]:

for i in np.arange(3):
    print('🐻')
print('👧🏼')

Randomization and iteration

In the next few lectures, we'll learn how to simulate random events, like flipping a coin.

Often, we will:
1. Run an experiment, e.g. "flip 10 coins."
2. Keep track of some result, e.g. "number of heads."
3. Repeat steps 1 and 2 many, many times using a for-loop.

Storing the results

To store our results, we'll typically use an int or an array.
If using an int, we define an int variable (usually set to 0) before the loop, then use + to add to it inside the loop.
If using an array, we create an array (usually empty) before the loop, then use np.append to add to it inside the loop.
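As a minimal sketch of the int approach, the loop below counts how many experiments produced at least 5 heads. The list `results` holds made-up values, not simulated ones, so the output is the same on every run.

```python
# Made-up results of five experiments (number of heads in 10 flips each).
results = [4, 6, 5, 7, 3]

# Start the count at 0 before the loop.
count = 0

for num_heads in results:
    # Add 1 each time an experiment had at least 5 heads.
    if num_heads >= 5:
        count = count + 1

print(count)   # 3
```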

`np.append`

This function takes two inputs:
- an array
- an element to add on to the end of the array
It returns a new array. It does not modify the input array.
We typically use it like this to extend an array by one element: name_of_array = np.append(name_of_array, element_to_add)
Remember to store the result!

In [ ]:

some_array = np.array([])

In [ ]:

np.append(some_array, 'hello')

In [ ]:

some_array

In [ ]:

# Need to save the new array!
some_array = np.append(some_array, 'hello')
some_array

In [ ]:

some_array = np.append(some_array, 'there')
some_array

Example: Coin flipping

The function flip(n) flips n fair coins and returns the number of heads it saw. (Don't worry about how it works for now.)

In [ ]:

def flip(n):
    '''Returns the number of heads in n simulated coin flips, using randomness.'''
    return np.random.multinomial(n, [0.5, 0.5])[0]

In [ ]:

# Run this cell a few times – you'll see different results!
flip(10)

Let's repeat the act of flipping 10 coins, 10000 times.

Each time, we'll use the flip function to flip 10 coins and compute the number of heads we saw.
We'll store these numbers in an array, heads_array.
Every time we use our flip function to flip 10 coins, we'll add an element to the end of heads_array.

In [ ]:

# heads_array starts empty – before the simulation, we haven't flipped any coins!
heads_array = np.array([])

for i in np.arange(10000):
    
    # Flip 10 coins and count the number of heads.
    num_heads = flip(10)
    
    # Add the number of heads seen to heads_array.
    heads_array = np.append(heads_array, num_heads)

Now, heads_array contains 10000 numbers, each corresponding to the number of heads in 10 simulated coin flips.

In [ ]:

heads_array

In [ ]:

len(heads_array)
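One detail about heads_array: its values display as floats (such as 4.0) even though flip returns whole numbers. That is because np.array([]) creates an empty array of floats. The short sketch below shows this. The names `empty` and `with_one` are chosen for illustration.

```python
import numpy as np

# An empty array created this way stores floats by default.
empty = np.array([])
print(empty.dtype)       # float64

# Appending the int 4 therefore produces the float 4.0.
with_one = np.append(empty, 4)
print(with_one)          # [4.]
print(with_one.dtype)    # float64
```

The counts are still correct. Only the way they are displayed changes.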

In [ ]:

(bpd.DataFrame().assign(num_heads=heads_array)
 .plot(kind='hist', density=True, bins=np.arange(0, 12), ec='w', legend=False, 
       title = 'Distribution of the number of heads in 10 coin flips')
);

The accumulator pattern

`for`-loops in DSC 10

Almost every for-loop in DSC 10 will use the accumulator pattern.
- This means we initialize a variable, and repeatedly add on to it within a loop.
Do not use for-loops to perform mathematical operations on every element of an array or Series.
- Instead use DataFrame manipulations and built-in array or Series methods.
Helpful video 🎥: For Loops (and when not to use them) in DSC 10.
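To illustrate the second point, the sketch below converts temperatures from Fahrenheit to Celsius in two ways. The array `temps_f` and its values are hypothetical. Both versions give the same result, but the array arithmetic is shorter and is the style to prefer.

```python
import numpy as np

# Hypothetical temperatures in degrees Fahrenheit.
temps_f = np.array([68, 77, 86])

# Discouraged: a for-loop that transforms one element at a time.
temps_c_loop = np.array([])
for t in temps_f:
    temps_c_loop = np.append(temps_c_loop, (t - 32) * 5 / 9)

# Preferred: arithmetic on the whole array at once.
temps_c = (temps_f - 32) * 5 / 9

print(temps_c_loop)   # [20. 25. 30.]
print(temps_c)        # [20. 25. 30.]
```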

Working with strings

Strings are sequences, so we can iterate over them, too!

In [ ]:

for letter in 'uc san diego':
    print(letter.upper())

In [ ]:

'california'.count('a')

Example: Vowel count

Below, complete the implementation of the function vowel_count, which returns the number of vowels in the input string s (including repeats). Example behavior is shown below.

>>> vowel_count('king triton')
3

>>> vowel_count('i go to uc san diego')
8

In [ ]:

def vowel_count(s):
    
    # We need to keep track of the number of vowels seen so far. Before we start, we've seen zero vowels.
    number = 0
    
    # For each of the 5 vowels:
    for vowel in 'aeiou':
       
        # Count the number of occurrences of this vowel in s.
        count = s.count(vowel)
        
        # Add this count to the variable number.
        number = number + count
    
    # Once we've gotten through all 5 vowels, return the answer.
    return number

In [ ]:

vowel_count('king triton')

In [ ]:

vowel_count('i go to uc san diego')

Summary, next time

Summary

if-statements allow us to run pieces of code depending on whether certain conditions are True.
for-loops are used to repeat the execution of code for every element of a sequence.
- Lists, arrays, and strings are examples of sequences.

Next time

Probability.
A math lesson – no code!
