Lecture 4 – Strings and Arrays
DSC 10, Fall 2022
Announcements
Lab 1 is released and is due Saturday at 11:59PM.
Homework 1 is released and is due Tuesday at 11:59PM.
Finish the lab before you work on the homework!
Issues with DataHub? See here. Issues with Gradescope? See here. Other issues? Post on EdStem.
Agenda
Strings.
Lists.
Arrays.
Ranges.
Resources
We're covering a lot of content very quickly. If you're overwhelmed, just know that we're here to support you!
Office hours and EdStem are your friends 🤝.
Remember to check the Resources tab of the course website for programming resources.
Strings
Strings
A string is a snippet of text of any length.
In Python, strings are enclosed by either single quotes or double quotes.
String arithmetic
When using the + symbol between two strings, the operation is called "concatenation".
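A minimal sketch of concatenation; the variable names and text here are made up for illustration.

```python
first = 'data'
second = 'science'

# + joins strings end to end; it does not add a space for us.
no_space = first + second            # 'datascience'
with_space = first + ' ' + second    # 'data science'

print(no_space)
print(with_space)
```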
String methods
Strings are associated with certain functions called string methods.
Access string methods with a . after the string (dot notation).
For instance, to use the upper method on a string s, we write s.upper().
Examples include upper, title, and replace.
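A short sketch of the three methods named above, using a made-up string s. Each method returns a new string and leaves s itself unchanged.

```python
s = 'baby panda'

shouted = s.upper()                    # 'BABY PANDA'
titled = s.title()                     # 'Baby Panda'
swapped = s.replace('baby', 'giant')   # 'giant panda'

print(shouted, titled, swapped)
print(s)  # still 'baby panda'
```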
Special characters in strings
Single quotes and double quotes are usually interchangeable, except when the string itself contains a single or double quote.
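A small sketch of when the choice of quotes matters; the example strings are made up. The backslash escape in the last line is standard Python, though it may not be the approach shown in lecture.

```python
# A string containing an apostrophe is easiest to write with double quotes.
with_apostrophe = "Don't panic"

# A string containing double quotes is easiest to write with single quotes.
with_quotes = 'She said "hello"'

# Alternatively, a backslash escapes a quote inside the same kind of quotes.
escaped = 'Don\'t panic'

print(with_apostrophe)
print(with_quotes)
print(escaped == with_apostrophe)  # True
```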
Aside: print
By default, Jupyter notebooks display the "raw" value of the expression on the last line of a cell.
The print function displays the value in human-readable text when it's evaluated.
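A sketch of the difference, assuming the "raw" value a notebook shows for a string matches what Python's built-in repr returns. The string is made up, and \n is the newline character.

```python
message = 'line one\nline two'

# Roughly what a notebook would display if `message` were the last line of a cell:
raw = repr(message)
print(raw)       # quotes and \n are shown literally, on one line

# What print displays: the newline is rendered, so the text spans two lines.
print(message)
```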
Type conversion to and from strings
Any value can be converted to a string using str.
Some strings can be converted to int and float.
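A minimal sketch of these conversions with made-up values. The last part shows that only some strings convert: text that doesn't look like a number raises a ValueError.

```python
as_text = str(3.14)        # '3.14'
as_int = int('42')         # 42
as_float = float('2.5')    # 2.5

print(as_text, as_int, as_float)

# Not every string can be converted to a number.
try:
    int('forty-two')
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(conversion_failed)  # True
```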
Concept Check ✅ – Answer at cc.dsc10.com
Assume you have run the following statements:
Choose the expression that will be evaluated without an error.
A. x + y
B. x + int(y + z)
C. str(x) + int(y)
D. str(x) + z
E. All of them have errors
Lists
Motivation
How would we store the temperatures for each of the first 6 days in the month of September?
Our best solution right now is to create a separate variable for each day.
This technically allows us to do things like compute the average temperature through the first 6 days:
Imagine a whole month's data, or a whole year's data. It seems like we need a better solution.
Lists in Python
In Python, a list is used to store multiple values in a single value/variable. To create a new list from scratch, we use [square brackets].
Notice that the elements in a list don't need to be unique!
Lists make working with sequences easy!
To find the average temperature, we just need to divide the sum of the temperatures by the number of temperatures recorded:
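A sketch of that calculation. The temperature values and the name temperature_list are made up for illustration; note that 72 appears twice, which is allowed.

```python
# Hypothetical temperatures for the first 6 days of September.
temperature_list = [70, 72, 65, 64, 62, 72]

# sum adds up the elements; len counts them.
average = sum(temperature_list) / len(temperature_list)
print(average)  # 67.5
```

The same line works unchanged for a month or a year of data, which is the advantage over one variable per day.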
Types
The type of a list is... list.
Within a list, you can store elements of different types.
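A quick sketch with a made-up list, checking the type of the list and of a few of its elements.

```python
mixed = [68, 'sixty-eight', 68.0, True]

print(type(mixed))      # <class 'list'>

# Each element keeps its own type inside the list.
print(type(mixed[0]))   # <class 'int'>
print(type(mixed[1]))   # <class 'str'>
print(type(mixed[2]))   # <class 'float'>
```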
There's a problem...
Lists are very slow.
This is not a big deal when there aren't many entries, but it's a big problem when there are millions or billions of entries.
Arrays
NumPy
NumPy (pronounced "num pie") is a Python library (module) that provides support for arrays and operations on them.
The babypandas library, which you will learn about next week, goes hand-in-hand with NumPy.
NumPy is used heavily in the real world.
To use numpy, we need to import it. It's usually imported as np (but doesn't have to be!).
Arrays
Think of NumPy arrays (just "arrays" from now on) as fancy, faster lists.

To create an array, we pass a list as input to the np.array function.
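A minimal sketch of importing NumPy and building an array from a list. The temperature values are made up for illustration.

```python
import numpy as np

# Hypothetical temperatures, first as a list...
temperature_list = [70, 72, 65, 64, 62, 72]

# ...then converted to an array by passing the list to np.array.
temperature_array = np.array(temperature_list)

print(temperature_array)
print(type(temperature_array))  # <class 'numpy.ndarray'>
```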
Positions
When people stand in a line, each person has a position.

Similarly, each element of an array (and list) has a position.
Accessing elements by position
Python, like most programming languages, is "0-indexed."
This means that the position of the first element in an array is 0, not 1.
One reason: an element's position represents the number of elements in front of it.
To access the element in array arr_name at position pos, we use the syntax arr_name[pos].
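A sketch of 0-indexed access, using a made-up temperature array. Because counting starts at 0, the last position is one less than the length.

```python
import numpy as np

temperature_array = np.array([70, 72, 65, 64, 62, 72])  # hypothetical values

first = temperature_array[0]   # position 0: no elements in front of it
third = temperature_array[2]   # position 2: two elements in front of it
last = temperature_array[len(temperature_array) - 1]

print(first, third, last)  # 70 65 72
```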
Types
Earlier in the lecture, we saw that lists can store elements of multiple types.
This is not true of arrays – all elements in an array must be of the same type.
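A sketch of what happens when np.array is given elements of different types: NumPy converts them all to one common type. The values are made up.

```python
import numpy as np

# An int mixed with floats: every element becomes a float.
mixed_numbers = np.array([1, 2.5, 3])
print(mixed_numbers.dtype)   # float64

# Numbers mixed with a string: every element becomes a string.
mixed_text = np.array([1, 'two', 3.0])
print(mixed_text)
```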
Array-number arithmetic
Arrays make it easy to perform the same operation on every element. This behavior is formally known as "broadcasting".
Note: In none of the above cells did we actually modify temperature_array! Each of those expressions created a new array.
To actually change temperature_array, we need to reassign it to a new array.
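A sketch of array-number arithmetic and reassignment. The values in temperature_array are made up; the Fahrenheit-to-Celsius formula is just one example of an operation applied to every element.

```python
import numpy as np

temperature_array = np.array([70, 72, 65, 64, 62, 72])  # hypothetical, in °F

# Each expression builds a NEW array; temperature_array is untouched.
warmer = temperature_array + 5
celsius = (temperature_array - 32) * 5 / 9
print(warmer)
print(temperature_array)  # unchanged

# To actually change temperature_array, reassign the name to the new array.
temperature_array = temperature_array + 5
print(temperature_array)
```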
Element-wise arithmetic
We can apply arithmetic operations to multiple arrays, provided they have the same length.
The result is computed element-wise, which means that the arithmetic operation is applied to one pair of elements from each array at a time.
For example, a + b is an array whose first element is the sum of the first element of a and the first element of b.
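A minimal sketch of element-wise arithmetic on two made-up arrays of the same length.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

total = a + b      # pairs up positions: 1+10, 2+20, 3+30
product = a * b    # 1*10, 2*20, 3*30

print(total)    # [11 22 33]
print(product)  # [10 40 90]
```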
Example: TikTok views 🎬
Baby Panda made a series of five TikTok videos called "A Day In the Life of a Data Science Mascot". The numbers of views they've received on these videos are stored in the array views below.
Some questions:
What was their average view count?
How many views did their most and least popular videos receive?
How many views above average did each of their videos receive? How many views above average did their most viewed video receive?
It has been estimated that TikTok pays their creators $0.03 per 1000 views. If this is true, how many dollars did Baby Panda earn on their most viewed video?
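One possible way to answer these questions, sketched with made-up view counts (the numbers in views below are hypothetical, not real data). It assumes the array methods .mean(), .max(), and .min(); other approaches work too.

```python
import numpy as np

# Hypothetical view counts for the five videos.
views = np.array([1000, 5000, 3000, 20000, 1000])

average_views = views.mean()
most_views = views.max()
least_views = views.min()

# Array-number arithmetic: subtract the average from every element.
above_average = views - average_views
most_above_average = above_average.max()

# $0.03 per 1000 views, applied to the most viewed video.
earnings = most_views / 1000 * 0.03

print(average_views, most_views, least_views)
print(above_average)
print(most_above_average, earnings)
```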
Ranges
Motivation
We often find ourselves needing to make arrays like this:
There needs to be an easier way to do this!
Ranges
A range is an array of evenly spaced numbers. We create ranges using np.arange.
The most general way to create a range is np.arange(start, end, step). This returns an array such that:
The first number is start. By default, start is 0.
All subsequent numbers are spaced out by step, until (but excluding) end. By default, step is 1.
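A short sketch of np.arange with one, two, and three arguments. The specific numbers are chosen only for illustration.

```python
import numpy as np

# Only end: start defaults to 0 and step defaults to 1.
up_to_five = np.arange(5)            # 0, 1, 2, 3, 4

# start and end: step defaults to 1.
three_to_seven = np.arange(3, 8)     # 3, 4, 5, 6, 7

# start, end, and step: end (20) is excluded.
by_fives = np.arange(0, 20, 5)       # 0, 5, 10, 15

print(up_to_five)
print(three_to_seven)
print(by_fives)
```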
Activity
🎉 Congrats! 🎉 You won the lottery 💰. Here's how your payout works: on the first day of September, you are paid $0.01. Every day thereafter, your pay doubles, so on the second day you're paid $0.02, on the third day you're paid $0.04, on the fourth day you're paid $0.08, and so on.
September has 30 days.
Write a one-line expression that uses the numbers 2 and 30, along with the function np.arange and the method .sum(), that computes the total amount in dollars you will be paid in September.
Summary, next time
Summary
Strings are used to store text. Enclose them in single or double quotes.
Lists and arrays are used to store sequences.
Arrays are faster and more convenient for numerical operations.
You can easily perform numerical operations on all elements of an array and perform operations on multiple arrays.
Ranges are arrays of equally-spaced numbers.
Remember to refer to the resources from the start of lecture!
Next time
We'll learn about how to use Python to work with real-world tabular data.