Workshop 3
Lists and loops
Task 3.1
We've seen that the string method find() returns the index of the first occurrence of a character within a string. Often we want to find the indices of all occurrences of a character in a string.
Write some code that loops through the following DNA sequence and outputs the indices of all occurrences of the base "T".
Hint: You should be able to modify your code from Exercise 10.3 for this task.
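As a rough illustration of the idea (not the code from Exercise 10.3), the sketch below collects every index at which a chosen base appears. The sequence `example_seq` is made up for this example and is not the sequence supplied with the task; `enumerate()` is used here to get each index alongside its character, but looping over `range(len(example_seq))` would work just as well.

```python
# Made-up sequence for illustration only; not the sequence supplied with the task.
example_seq = "ATTGCTA"

t_positions = []  # will hold the index of every "T" we meet
for index, base in enumerate(example_seq):
    if base == "T":
        t_positions.append(index)

print(t_positions)  # [1, 2, 5]
```

Unlike find(), which stops at the first match, the loop visits every character, so no occurrence is missed.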
Task 3.3
Find and print the middle (also called the median) value of the sorted list eggs_laid. Do not find the middle value by hand: find it using code.
Hint 1: Use the list's length. For example, if the length is 5, the middle value is at the 3rd position or index 2.
Hint 2: If you get the error "list indices must be integers or slices, not float", remember that the result of the division operator / is a float even when the divisor and dividend are integers. That means you need the integer division operator //.
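The difference between the two division operators can be seen in a short sketch. The list `example_values` is invented for illustration (it is not eggs_laid) and is assumed to be sorted already, with an odd number of values.

```python
# Invented, already-sorted values; not the eggs_laid data.
example_values = [3, 8, 11, 15, 20]

float_result = len(example_values) / 2   # true division always gives a float
middle_index = len(example_values) // 2  # integer (floor) division gives an int

print(float_result)                  # 2.5
print(middle_index)                  # 2
print(example_values[middle_index])  # 11

# example_values[float_result] would raise:
# TypeError: list indices must be integers or slices, not float
```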
Task 3.4
There are three steps in calculating the median of a list of numbers:
1. Sort the values from lowest to highest.
2. Find the number of values.
3. Find the middle value:
   - If the number of values is odd, the median is the middle value.
   - If the number of values is even, the median is the average of the middle two values.
E.g., the median of the values [0.5, 0.6, 0.9] is 0.6. The median of the values [0.5, 0.6, 0.9, 1.1] is (0.6+0.9)/2 = 0.75.
Write code to calculate the median of a list of numbers of any length.
Apply your code to finding the median of the following two lists.
Hint 1: You will need to sort the list, find its length, and decide which value or values are in the middle.
Hint 2: For a list with an even number of values, be careful to select the middle elements with the correct indices.
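One possible way to put the three steps together is sketched below. The function name `median_of` is hypothetical (it is not defined anywhere in the workshop), the sketch assumes the list is not empty, and it is tried only on the two small example lists given in the task text, entered here in shuffled order.

```python
def median_of(values):
    """Return the median of a non-empty list of numbers (illustrative helper)."""
    ordered = sorted(values)   # step 1: sort without changing the original list
    n = len(ordered)           # step 2: number of values
    mid = n // 2               # integer division so mid can be used as an index
    if n % 2 == 1:
        # Odd length: a single middle value.
        return ordered[mid]
    # Even length: average the two values either side of the centre.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([0.9, 0.5, 0.6]))       # 0.6
print(median_of([1.1, 0.5, 0.9, 0.6]))  # 0.75
```

For an even length n, the two middle values sit at indices n // 2 - 1 and n // 2, which is the detail Hint 2 warns about.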
Task 3.9
A single nighttime survey of bats in the Forest of Dean produced a list of the species of all individual bats caught, measured and released.
Create a new list of unique bat species caught.
Print the list of unique species and the number of unique species.
Hint: Create an empty list and, as you loop through bat_list, append a bat species to it only if it is not already in the list.
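The pattern in the hint might look something like the sketch below. The list `example_bats` is a small invented stand-in, not the survey's bat_list.

```python
# Invented stand-in for the survey data; not the real bat_list.
example_bats = [
    "Myotis daubentonii",
    "Pipistrellus pipistrellus",
    "Myotis daubentonii",
    "Nyctalus noctula",
    "Pipistrellus pipistrellus",
]

unique_species = []
for species in example_bats:
    # Append only the first time a species is seen.
    if species not in unique_species:
        unique_species.append(species)

print(unique_species)
print(len(unique_species))  # 3
```

Because species are appended in the order they are first met, the result keeps the order of first appearance in the original list.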
Task 3.10
In the bat survey, each bat's wingspan was measured. These are given in centimetres, for each bat, in the list wingspans below.
Using the bat_list and wingspans lists, print the average wingspan of Pipistrellus nathusii to 2 decimal places.
Hint 1: Loop through the two lists simultaneously. Each time you encounter Pipistrellus nathusii in bat_list, append its wingspan to another list. Once you have finished looping through bat_list, calculate the average wingspan using the list of wingspans you have constructed.
Hint 2: Rather than summing the values in the list by looping over the list as we did in Notebook 12, you might want to use the inbuilt sum() function. Google it to find out how to use it.
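A minimal sketch of the two hints combined is shown below. Both lists are invented for illustration (they are not the survey's bat_list and wingspans), and `zip()` is just one way to walk two lists in step; looping over indices with `range(len(...))` is an equally valid alternative.

```python
# Invented data for illustration; not the real survey lists.
example_bats = [
    "Pipistrellus nathusii",
    "Myotis daubentonii",
    "Pipistrellus nathusii",
    "Nyctalus noctula",
]
example_wingspans = [23.5, 25.0, 24.1, 36.2]  # centimetres, made up

nathusii_wingspans = []
# zip() pairs the first species with the first wingspan, and so on.
for species, wingspan in zip(example_bats, example_wingspans):
    if species == "Pipistrellus nathusii":
        nathusii_wingspans.append(wingspan)

# sum() adds up every number in the list; dividing by the count gives the mean.
average = sum(nathusii_wingspans) / len(nathusii_wingspans)
print(f"{average:.2f}")  # 23.80
```

The `:.2f` format specifier rounds to 2 decimal places for display only; the value stored in `average` is unchanged. If no matching species were found, the division would fail with a ZeroDivisionError, so real code may want to check for an empty list first.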
Task 3.11
Repeated, short sequences are of interest to geneticists as they suggest the existence of transposable elements within genomic DNA.
Search for and print the first sequence in the following list that starts with "TATA" and has a second "TATA" repeat later in the sequence.
If no such sequence is found, print a message saying that none was found.
Hint: Loop through the DNA sequences. If a sequence starts with "TATA", test whether the sequence contains a second "TATA" substring. If it does, break out of the loop; otherwise, move on to the next sequence.
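The search-and-break pattern from the hint could be sketched as follows. The list `example_seqs` is invented and is not the list supplied with the task. The sketch also makes an assumption the task does not spell out: the second "TATA" must start at index 4 or later, so it cannot overlap the first one. Starting the search at index 1 instead would also accept overlapping repeats such as "TATATA".

```python
# Invented sequences for illustration; not the list supplied with the task.
example_seqs = ["GCGCTATA", "TATAGGCC", "TATACGTATAGG", "TATAAATATA"]

found = None  # stays None if no sequence qualifies
for seq in example_seqs:
    # find(sub, start) searches from index `start` and returns -1 if absent.
    if seq.startswith("TATA") and seq.find("TATA", 4) != -1:
        found = seq
        break  # stop at the first match; later sequences are not checked

if found is None:
    print("No sequence with a repeated TATA was found.")
else:
    print(found)  # TATACGTATAGG
```

The last sequence in the list also qualifies, but the `break` means the loop never reaches it, which is what "the first sequence" in the task asks for.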