
Loops III - controlling and nesting loops

Breaking out of a loop: break

In Exercise 12.3 we looped through all DNA sequences in the list DNA_sequences testing if they contained the start codon "ATG".

Suppose instead that we want to loop through the list only until we find the first DNA sequence containing a start codon, and then stop looping. We can do this by breaking out of the loop early, i.e., before we reach the end of the list.

The following code shows how this is done.
```python
DNA_sequences = ["GCTACGCTGGC", "ATGCACGACGT", "TAAGCCGGTAG", "AGTTGGAAATC"]

# This code looks for the first DNA sequence with a start codon ATG in a list.

# So that we can tell whether such a sequence has been found, we first assign
# an empty string to the variable called found.
# If no such sequence has been found, found will still be empty after looping through the list.
# If a sequence is found then found will be assigned that sequence.
found = ''

# Loop through DNA_sequences one sequence at a time.
for dna_seq in DNA_sequences:

    # Test if the start codon ATG is in the current DNA sequence.
    # If it is, assign the sequence to found and immediately
    # break out of the loop without testing the remaining DNA sequences.
    # If it isn't, do nothing and keep looping through the list.
    if 'ATG' in dna_seq:
        found = dna_seq
        break

# If found is not an empty string then we have found a sequence with ATG.
if found:
    print( f'The first DNA sequence to contain ATG is {found}' )
else:
    print( 'No DNA sequence contains ATG' )
```
The first DNA sequence to contain ATG is ATGCACGACGT

As we loop through the list of DNA sequences we test if the start codon "ATG" is in the current sequence. If it isn't, we do nothing and move on to the next sequence in the list. However, if "ATG" is in the current sequence we assign the sequence to the variable found (which was previously empty) and immediately break out of the loop without testing the remaining sequences.
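To see that break really does skip the remaining sequences, the following sketch records every sequence that is tested. The list tested is a hypothetical helper added only for this illustration; it is not part of the code above.

```python
DNA_sequences = ["GCTACGCTGGC", "ATGCACGACGT", "TAAGCCGGTAG", "AGTTGGAAATC"]

# tested is a hypothetical helper list: it records each sequence we examine.
tested = []
found = ''

for dna_seq in DNA_sequences:
    # Record the sequence before testing it.
    tested.append(dna_seq)
    if 'ATG' in dna_seq:
        found = dna_seq
        # Leave the loop now; later sequences are never appended to tested.
        break

print( f'Sequences tested: {len(tested)} of {len(DNA_sequences)}' )
print( f'Found: {found}' )
```

Only two of the four sequences should be tested, because the second one contains "ATG" and the loop stops there.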

If no sequence in the list contains "ATG" then we will loop through all sequences in the list and the variable found will still be empty after we exit the loop normally.

Next we do something we haven't seen before. We test if found is non-empty, which means it contains a DNA sequence. When a string is used as a condition, Python treats an empty string as false and any non-empty string as true. So if found is non-empty then the condition

if found:

is true and we print the sequence with "ATG" in it. However, if found is still empty then the condition if found: is false and therefore we print that no DNA sequence with ATG in it was found.
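As a small sketch of how this test behaves, the code below compares the shorthand test with an explicit comparison against the empty string. The names shorthand, explicit and results are hypothetical and used only for this illustration.

```python
# results is a hypothetical list collecting the outcome for each value of found.
results = []

for found in ['', 'ATGCACGACGT']:
    # bool(found) gives the truth value that "if found:" tests.
    shorthand = bool(found)
    # The same test written out explicitly.
    explicit = (found != '')
    results.append( (found, shorthand, explicit) )
    print( repr(found), shorthand, explicit )
```

For a string, the two tests should always agree, so if found: is simply a shorter way of writing if found != '':.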

If you remove the "ATG" in the second sequence in the list DNA_sequences and rerun the code you will see that no sequence with a start codon is found.
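As a sketch of that experiment, here is the same search with the leading "ATG" deleted from the second sequence (one possible way of making the edit). The variable message is a hypothetical addition so that the result can be inspected as well as printed.

```python
# The second sequence was "ATGCACGACGT"; here its "ATG" has been removed.
DNA_sequences = ["GCTACGCTGGC", "CACGACGT", "TAAGCCGGTAG", "AGTTGGAAATC"]

found = ''

for dna_seq in DNA_sequences:
    if 'ATG' in dna_seq:
        found = dna_seq
        break

# No sequence contains ATG, so the loop ends normally and found stays empty.
if found:
    message = f'The first DNA sequence to contain ATG is {found}'
else:
    message = 'No DNA sequence contains ATG'

print(message)
```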

Nested loops

Loops can be nested, i.e., one loop inside another.

In the following code we count the number of occurrences of each base in each DNA sequence in a list.

Read the following code to see if you understand it, then run it.
```python
DNA_sequences = ["GCTACGCTGGC", "ACGCACGACGC", "TAAGCCGGTAG", "AGTTGGAAATC"]

# Loop through each DNA sequence in the list one at a time.
for dna_seq in DNA_sequences:

    # Print the DNA sequence.
    print(dna_seq)

    # Loop through the four bases A, C, G and T one at a time.
    for base in 'ACGT':

        # Count the number of occurrences of the current base in the DNA sequence.
        count_base = dna_seq.count(base)

        if count_base == 1:
            # If there is a single occurrence of the current base print this:
            print( f'There is {count_base} "{base}"' )
        else:
            # Otherwise there are no, or more than one, occurrences so print this:
            print( f'There are {count_base} "{base}"s' )

    # Print a blank line to separate the output for each DNA sequence.
    print()
```
GCTACGCTGGC
There is 1 "A"
There are 4 "C"s
There are 4 "G"s
There are 2 "T"s

ACGCACGACGC
There are 3 "A"s
There are 5 "C"s
There are 3 "G"s
There are 0 "T"s

TAAGCCGGTAG
There are 3 "A"s
There are 2 "C"s
There are 4 "G"s
There are 2 "T"s

AGTTGGAAATC
There are 4 "A"s
There is 1 "C"
There are 3 "G"s
There are 3 "T"s

In the outer (top) loop we loop through each DNA sequence: dna_seq is the iterating variable.

Inside it, we loop through the four bases A, C, G and T in turn: base is the iterating variable. Note that this inner loop is indented, which means it is inside the outer loop. The whole inner loop therefore runs once for each DNA sequence.
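One way to check how often the inner loop body runs is to count the repetitions. In this sketch the counter inner_runs is a hypothetical variable added for illustration; with 4 sequences and 4 bases we expect 4 × 4 = 16 repetitions.

```python
DNA_sequences = ["GCTACGCTGGC", "ACGCACGACGC", "TAAGCCGGTAG", "AGTTGGAAATC"]

# inner_runs is a hypothetical counter for the number of times the inner body runs.
inner_runs = 0

# Outer loop: once per DNA sequence.
for dna_seq in DNA_sequences:
    # Inner loop: once per base, for every DNA sequence.
    for base in 'ACGT':
        inner_runs += 1

print( f'The inner loop body ran {inner_runs} times' )
```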

We use the string method count() to count the number of occurrences of base in dna_seq and print out the result.
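As a brief sketch of what count() does, the code below compares it with counting by hand in a loop for the first sequence in the list. The names manual_count and method_count are hypothetical and used only for this illustration.

```python
dna_seq = "GCTACGCTGGC"

# Count the "C"s by hand: look at each character and add one for every match.
manual_count = 0
for character in dna_seq:
    if character == 'C':
        manual_count += 1

# Count the "C"s using the string method count().
method_count = dna_seq.count('C')

print(manual_count, method_count)

# count() returns 0, rather than raising an error, when the base is absent.
print( dna_seq.count('N') )
```

Both approaches should give the same number, so count() saves us from writing the inner counting loop ourselves.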

We use a conditional statement to test if there is just one occurrence so that the output is grammatically correct.
