Lists
The following table lists the average diameters of the white blood cell types found in human blood.
Cell type | Diameter (µm) |
---|---|
Neutrophil | 11 |
Eosinophil | 11 |
Basophil | 13.5 |
Small lymphocyte | 7.5 |
Large lymphocyte | 13.5 |
Monocyte | 22.5 |
Suppose we want to use Python to calculate the average diameter across these white blood cell types. One way would be to create a variable for each diameter, then compute the average and assign it to a numerical variable called `average_diameter`.
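A minimal sketch of this one-variable-per-value approach, using the diameters from the table. The individual variable names (`neutrophil`, `eosinophil`, and so on) are chosen here for illustration; only `average_diameter` is named in the text.

```python
# Diameters taken from the table above, one variable per cell type.
# These variable names are illustrative assumptions.
neutrophil = 11
eosinophil = 11
basophil = 13.5
small_lymphocyte = 7.5
large_lymphocyte = 13.5
monocyte = 22.5

# Add the six values and divide by the number of cell types.
average_diameter = (neutrophil + eosinophil + basophil
                    + small_lymphocyte + large_lymphocyte + monocyte) / 6
print(average_diameter)
```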
That's fine if we have only a few values. But if a data set contains hundreds or millions of values, coding them like this is tedious, perhaps impossible, and not reusable.
Fortunately, Python has a data type called a list that can store multiple values in a single variable. Lists allow us to write simpler, more efficient and reusable code.
A list is a sequence of items
Using white blood cells as an example, we can assign a sequence of numbers to a list variable called `diameters` and a sequence of strings to another list variable called `cell_types`. Python knows they are lists because of the pair of square brackets surrounding the comma-separated numbers and strings.
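For example, the two lists might be created as follows. The values come from the table above, listed in the same order, so that each diameter lines up with its cell type.

```python
# A list of numbers: the diameters from the table, in table order.
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

# A list of strings: the cell types, in the same order as the diameters.
cell_types = ["Neutrophil", "Eosinophil", "Basophil",
              "Small lymphocyte", "Large lymphocyte", "Monocyte"]

print(diameters)
print(cell_types)
```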
The individual values in a list are called elements or items.
Length of a list
The number of items in a list is called its length. As for strings, the length of a list is returned by the function `len()`.
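A short sketch of `len()` applied to a string and to a list. The last two lines go one step beyond the text: they use the built-in `sum()` function, which is not introduced above, to show how a list makes the average calculation independent of the number of values.

```python
cell_types = ["Neutrophil", "Eosinophil", "Basophil",
              "Small lymphocyte", "Large lymphocyte", "Monocyte"]
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

print(len("Monocyte"))   # number of characters in a string: 8
print(len(cell_types))   # number of items in a list: 6

# Assumption: sum() is a built-in not covered in the text above.
average_diameter = sum(diameters) / len(diameters)
print(average_diameter)
```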
Creating an empty list
When we created the lists `diameters` and `cell_types` above, they were initialised with sequences of numbers and strings respectively.
Sometimes we want to initialise an empty list. To do this we simply use a pair of square brackets with nothing between them.
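For instance, an empty list can be created and checked like this:

```python
# A pair of square brackets with nothing between them is an empty list.
cell_types = []

print(cell_types)       # prints []
print(len(cell_types))  # prints 0
```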
Adding items to a list: append()
To add an item to the end of a list we use its `append()` method.
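A minimal sketch, starting from an empty list and printing it after each step:

```python
cell_types = []
print(cell_types)                # prints []

cell_types.append("Neutrophil")  # add one item to the end
print(cell_types)                # prints ['Neutrophil']

cell_types.append("Eosinophil")  # add a second item after the first
print(cell_types)                # prints ['Neutrophil', 'Eosinophil']
```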
Notice that the first time we print the list it is empty. Then we add "Neutrophil", so we have a list with one item. Then we add "Eosinophil" to the end of the list resulting in a list with two items.
We can add as many items as we want to a list.
Sorting a list
Lists can be sorted using either the `sorted()` function or the `sort()` method. The difference between the two is that `sorted()` creates a new sorted list, leaving the original list intact, whereas `sort()` sorts the list in place, i.e., the original list is modified.
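The difference can be seen in a short sketch. The name `sorted_diameters` is chosen here for illustration.

```python
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

# sorted() returns a new list; the original is left untouched.
sorted_diameters = sorted(diameters)
print(sorted_diameters)  # prints [7.5, 11, 11, 13.5, 13.5, 22.5]
print(diameters)         # prints [11, 11, 13.5, 7.5, 13.5, 22.5]

# sort() reorders the original list itself.
diameters.sort()
print(diameters)         # prints [7.5, 11, 11, 13.5, 13.5, 22.5]
```

Note that `sort()` returns `None`, so writing `diameters = diameters.sort()` would lose the list.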
Items can be sorted in descending order by specifying `reverse=True` in `sort()` or `sorted()`.
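A brief sketch of both forms. Strings are sorted alphabetically, so descending order runs from "S" back to "B" here. The name `descending_types` is illustrative.

```python
cell_types = ["Neutrophil", "Eosinophil", "Basophil",
              "Small lymphocyte", "Large lymphocyte", "Monocyte"]
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

# New list in reverse alphabetical order; cell_types is unchanged.
descending_types = sorted(cell_types, reverse=True)
print(descending_types)

# In-place sort from largest to smallest.
diameters.sort(reverse=True)
print(diameters)  # prints [22.5, 13.5, 13.5, 11, 11, 7.5]
```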
Accessing values in a list
Accessing individual items
Lists are ordered: each item occurs at a position known as its index. (This is just like the indices of characters in a string.)
An item's value can be accessed by using the index for that item. As for strings, the first item of the list is at index 0, the second item is at index 1, and so on. The last item in the list has index -1, the penultimate item has index -2, and so on.
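As an illustration, here are positive and negative indices applied to the list of cell types:

```python
cell_types = ["Neutrophil", "Eosinophil", "Basophil",
              "Small lymphocyte", "Large lymphocyte", "Monocyte"]

print(cell_types[0])   # first item: Neutrophil
print(cell_types[1])   # second item: Eosinophil
print(cell_types[-1])  # last item: Monocyte
print(cell_types[-2])  # penultimate item: Large lymphocyte
```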
IndexError
If we use an index that is equal to or larger than the number of items in the list, we get an `IndexError: list index out of range`, as the following traceback shows.
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-14-19da7042051e> in <module>
----> 1 print( cell_types[1000] )
IndexError: list index out of range
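The valid indices for a list of length n run from 0 to n - 1 (and from -1 down to -n). The sketch below shows that the index n itself is already out of range. It uses `try`/`except`, which is not covered in the text above, only so that the example runs to completion instead of stopping at the error.

```python
cell_types = ["Neutrophil", "Eosinophil", "Basophil",
              "Small lymphocyte", "Large lymphocyte", "Monocyte"]
n = len(cell_types)

print(cell_types[n - 1])  # largest valid index: prints Monocyte

# Assumption: try/except is used here purely to catch and display the error.
try:
    print(cell_types[n])  # one past the end: raises IndexError
except IndexError as error:
    message = str(error)
    print("IndexError:", message)
```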
Accessing multiple items: slicing
Multiple items can be accessed at once by specifying a range of indices using slices, exactly as for strings. The slice notation is `[start index : stop index]`, or `[start index : stop index : step size]` for extended slices. The item at the stop index is not included in the result.
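A short sketch of a simple slice and an extended slice, with a single-item access for comparison:

```python
cell_types = ["Neutrophil", "Eosinophil", "Basophil",
              "Small lymphocyte", "Large lymphocyte", "Monocyte"]

# Items at indices 1 and 2 (the stop index 3 is excluded).
print(cell_types[1:3])    # prints ['Eosinophil', 'Basophil']

# Every second item from index 0 up to, but not including, index 6.
print(cell_types[0:6:2])  # prints ['Neutrophil', 'Basophil', 'Large lymphocyte']

# A single index returns the item itself, not a list.
print(cell_types[1])      # prints Eosinophil
```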
Notice that the type of the variable printed is a list (it has square brackets). So when we access an individual item of a list we get back just that item, whether it is a string or number. When we access a slice of a list we get back a list.
Testing membership of a list
You can also use `in` and `not in` to test whether an item is in a list.
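For example, each of these tests evaluates to `True` or `False`:

```python
cell_types = ["Neutrophil", "Eosinophil", "Basophil",
              "Small lymphocyte", "Large lymphocyte", "Monocyte"]

print("Monocyte" in cell_types)       # prints True
print("Leucocyte" in cell_types)      # prints False
print("Leucocyte" not in cell_types)  # prints True
```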
Getting the index of an item
As for strings, we can find the index of the first occurrence of an item in a list. With strings we use the `find()` method; for lists we use the `index()` method.
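A minimal sketch. The value 13.5 appears twice in `diameters` (at indices 2 and 4), so it also shows that only the first occurrence is reported.

```python
cell_types = ["Neutrophil", "Eosinophil", "Basophil",
              "Small lymphocyte", "Large lymphocyte", "Monocyte"]
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

print(cell_types.index("Basophil"))  # prints 2
print(diameters.index(13.5))         # prints 2, the first of two occurrences
```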
There is one difference from strings, though. The string method `find()` returns -1 when nothing is found, whereas if the item we are searching for isn't in the list, `index()` raises a `ValueError`.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-18-244feef246a1> in <module>
----> 1 print( cell_types.index('Leucocyte') )
ValueError: 'Leucocyte' is not in list
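One way to avoid the error is to test membership with `in` before calling `index()`. This is a sketch rather than the only approach; the names `query` and `position`, and the choice of `None` as a "not found" value, are assumptions made for illustration.

```python
cell_types = ["Neutrophil", "Eosinophil", "Basophil",
              "Small lymphocyte", "Large lymphocyte", "Monocyte"]
query = "Leucocyte"  # hypothetical search term

# Only call index() when the item is known to be present.
if query in cell_types:
    position = cell_types.index(query)
else:
    position = None  # illustrative "not found" marker

print(position)  # prints None
```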