
Loops IV - looping by index

We have so far only concerned ourselves with looping over a single list. Now we consider how to loop over multiple lists simultaneously.

But before we do that we need to introduce a new function called range().

How range() works

In its simplest form range(n) produces a sequence of ascending integers starting from zero and ending at n-1. The following code shows how it works.

# A loop that iterates 3 times.
for i in range(3):
    print( i )

0
1
2

In this example the loop iterates 3 times. The iterating variable is i. On the first iteration i is assigned the value 0, on the second iteration i is assigned the value 1, and on the third and last iteration i is assigned the value 2.

Using range(n, m) produces a sequence of ascending integers starting at n and ending at m-1. The following code shows how it works.

# A loop that iterates 3 times starting at 5 and ending at 7.
for i in range(5, 8):
    print( i )

5
6
7

Finally, using range(n, m, s) produces a sequence of integers starting at n, increasing in steps of size s, and stopping before m is reached. The last value is therefore at most m-1. The following code shows how it works.

# A loop that iterates 4 times starting at 0 and ending at 9 with a step size of 3.
for i in range(0, 10, 3):
    print( i )

0
3
6
9
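As a quick check of the three forms, a range can be converted to a list with list() so that all of its values are visible at once. This is only a small sketch, and the variable names are illustrative. Note the last example: the sequence stops at 8 rather than 9, because the next step would give 12, which is past the stop value of 10.

```python
# Converting each range to a list makes its contents visible at once.
one_arg = list(range(3))            # start defaults to 0 and step to 1
two_args = list(range(5, 8))        # 8 itself is not included
three_args = list(range(0, 10, 4))  # steps of 4, stopping before 10

print(one_arg)     # [0, 1, 2]
print(two_args)    # [5, 6, 7]
print(three_args)  # [0, 4, 8]
```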

Using range() to loop through strings

In loops range() is commonly used to access characters in a string using their indices. Remember from Notebook 6 that each character in a string has an index (or position) and we can use that index to access the character from the string. This table shows the indices of the string "Hello, world!" which we saw in Notebook 6.

string:  H   e   l   l   o   ,       w   o   r   l   d   !
index:   0   1   2   3   4   5   6   7   8   9   10  11  12

The following code uses range() to print the index of every character in "Hello, world!" alongside the character itself.

sentence = 'Hello, world!'

# Loop over every character in the variable sentence.
for i in range( len(sentence) ):
    print( f'{i}\t{sentence[i]}' )

0	H
1	e
2	l
3	l
4	o
5	,
6	 
7	w
8	o
9	r
10	l
11	d
12	!

Notice how in the loop,

for i in range( len(sentence) ):

we use the function len(sentence) to get the number of characters in the string "Hello, world!" (13 in total), and then use that length in range(). Instead we could have written

for i in range( 13 ):

but that would mean manually counting the number of characters in sentence first. Why do something when you can get the computer to do it for you?

In the print() function we also used a tab escape character "\t" to format the output into two columns.

A realistic example of looping through a string: Extracting codons

Consecutive triplets of bases in DNA are called codons, which are translated into amino acids to produce proteins. Let's write some code to extract codons from a DNA sequence.

# Assign a DNA sequence to the string variable dna_seq.
dna_seq = 'TTATGTATCCTTATATCACAACTCGAAGATTCTTCTTCTGCA'

# Loop through dna_seq in chunks of three bases, i.e., one codon at a time.
# Use indices to split dna_seq into codons.
# The indices needed are 0, 3, 6, 9, ... until the end of dna_seq is reached.
for i in range(0, len(dna_seq), 3):
    # Use string slicing to extract the codon.
    codon = dna_seq[i:i+3]
    print( f'{i}\t{codon}' )

0	TTA
3	TGT
6	ATC
9	CTT
12	ATA
15	TCA
18	CAA
21	CTC
24	GAA
27	GAT
30	TCT
33	TCT
36	TCT
39	GCA

In range() we have set the step size to 3 so that we index every third position in the DNA sequence, i.e., 0, 3, 6, 9, and so on.

We use string slicing to access substrings (i.e., the codons) of the DNA sequence. For example, when the iterating variable i is 0, the slice dna_seq[i:i+3] is dna_seq[0:3], which accesses the first to third characters in dna_seq, namely "TTA". On the next iteration of the loop i is 3, so we access dna_seq[3:6], i.e., the fourth to sixth characters, which are "TGT". And so on.
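The sequence above has a length that is an exact multiple of three. When that is not the case, the final slice simply returns whatever bases are left, which is not a complete codon. The sketch below uses a made-up 8-base sequence (short_seq is a hypothetical example, not data from this Notebook) to show the effect, and one possible way to skip the leftover bases by stopping the range two positions early.

```python
# Hypothetical 8-base sequence: two complete codons plus two leftover bases.
short_seq = 'ATGCCGTA'

# Stepping up to len(short_seq) includes the incomplete piece at the end.
all_chunks = []
for i in range(0, len(short_seq), 3):
    all_chunks.append(short_seq[i:i+3])

# Stopping two positions earlier keeps only the complete codons.
complete_codons = []
for i in range(0, len(short_seq) - 2, 3):
    complete_codons.append(short_seq[i:i+3])

print(all_chunks)       # ['ATG', 'CCG', 'TA']
print(complete_codons)  # ['ATG', 'CCG']
```

Whether leftover bases should be kept or dropped depends on the task, so treat the second loop as one option rather than a rule.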

Using range() to loop through lists

Using range() to loop through a list is the same as using it to loop through a string.

Run the following code to see how it is done using our example of calculating the average diameter of white blood cell types.

diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

# The number of items in the list diameters.
n = len( diameters )

# Initialise the sum of diameters to zero.
sum_of_diameters = 0

# Loop through the list summing the diameters one at a time.
for i in range(n):
    sum_of_diameters += diameters[i]

print( f'There are {n} items in the list' )
print( f'The sum of diameters is {sum_of_diameters} micrometers' )
print( f'The average diameter of white blood cell types is {sum_of_diameters/n:.4g} micrometers' )

There are 6 items in the list
The sum of diameters is 79.0 micrometers
The average diameter of white blood cell types is 13.17 micrometers

Notice how we have used n (the number of items in the list) in range(n).

The only difference between the above code and the code in Notebook 12 is the loop. In Notebook 12 we had

for d in diameters:
    sum_of_diameters += d

and in this Notebook we have

for i in range(n):
    sum_of_diameters += diameters[i]

Both do exactly the same thing. For the simple code we have written here, the first method is probably preferred because it is easier to read and understand. However, for other tasks the latter method may be better. It all depends on the task.
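One kind of task where looping by index helps is when each item has to be compared with its neighbour, because the index lets us reach both diameters[i] and diameters[i+1] in the same iteration. The following is a minimal sketch of that idea, not part of the original example; the list name differences is an illustrative choice. The range stops one short of the list length so that i+1 is always a valid index.

```python
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

# Store the difference between each diameter and the one before it.
differences = []

# Stop at len(diameters) - 1 so that diameters[i+1] always exists.
for i in range(len(diameters) - 1):
    differences.append(diameters[i+1] - diameters[i])

print(differences)  # [0, 2.5, -6.0, 6.0, 9.0]
```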

Simultaneous looping through multiple lists

Say we have two lists, one with forenames and another with surnames. Our task is to loop through both lists simultaneously, concatenate (join) each forename and surname and create a new list with each full name.

For example, suppose we start with the two lists

forenames = ['Harry', 'Ron', 'Hermione']
surnames = ['Potter', 'Weasley', 'Granger']

By combining the lists forenames and surnames we want to obtain the list

fullnames = ['Harry Potter', 'Ron Weasley', 'Hermione Granger']

Let's write out the algorithm for doing this:

  1. Initialise an empty list called fullnames

  2. Loop through the lists forenames and surnames simultaneously so that we have a forename and surname pair.

  3. Concatenate forename and surname into a full name.

  4. Append the full name to the list called fullnames.

Apart from Step 2, we have already seen how to do each of these steps in Notebooks 5 (strings) and 11 (lists). So how do we loop through two lists simultaneously?

We can use range() to loop through the indices of the lists. This is shown in the following code.

forenames = ['Harry', 'Ron', 'Hermione']
surnames = ['Potter', 'Weasley', 'Granger']

# Initialise an empty list to store the full names.
fullnames = []

# Loop through the indices of the lists of names.
# As both lists are the same length we can use the length of one of them in range().
for i in range( len(forenames) ):
    # Use an f-string to concatenate the forename and surname
    # and append to the end of the list of fullnames.
    fullnames.append( f'{forenames[i]} {surnames[i]}' )

print(fullnames)

['Harry Potter', 'Ron Weasley', 'Hermione Granger']
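The code above relies on both lists having the same length. If one list were longer, using its length in range() would produce an index that does not exist in the shorter list and Python would raise an IndexError. One possible safeguard, sketched below, is to loop only as far as the shorter list by using the built-in min() function. The extra forename 'Neville' and the variable name n_pairs are made up for this illustration.

```python
forenames = ['Harry', 'Ron', 'Hermione', 'Neville']  # one extra, hypothetical forename
surnames = ['Potter', 'Weasley', 'Granger']

# Use the shorter length so that every index is valid in both lists.
n_pairs = min(len(forenames), len(surnames))

fullnames = []
for i in range(n_pairs):
    fullnames.append(f'{forenames[i]} {surnames[i]}')

print(fullnames)  # ['Harry Potter', 'Ron Weasley', 'Hermione Granger']
```

Here the unmatched forename is silently left out. For real data it may be better to check that the lengths agree first and stop if they do not.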
