Workshop 4
Dictionaries
Task 4.1
The table below lists the average diameters of white blood cells in human blood.
Cell type | Diameter (μm) |
---|---|
Neutrophil | 11 |
Eosinophil | 11 |
Basophil | 13.5 |
Small lymphocyte | 7.5 |
Large lymphocyte | 13.5 |
Monocyte | 22.5 |
By looping through the following two lists simultaneously, construct a dictionary with cell types as keys and diameters as values.
Hint: Begin with an empty dictionary and, as you loop through the lists, add key:value pairs.
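One possible approach to the hint above is sketched here. The two lists are transcribed from the table, and the variable names (`cell_types`, `diameters`, `cell_diameters`) are assumptions made for illustration. The built-in `zip()` is used to step through both lists at once; looping over `range(len(cell_types))` and indexing both lists would work equally well.

```python
# Lists transcribed from the table above (variable names are assumptions).
cell_types = ["Neutrophil", "Eosinophil", "Basophil",
              "Small lymphocyte", "Large lymphocyte", "Monocyte"]
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

# Start with an empty dictionary and add one key:value pair per iteration.
cell_diameters = {}
for cell_type, diameter in zip(cell_types, diameters):
    cell_diameters[cell_type] = diameter

print(cell_diameters)
```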
Task 4.3
Below is the list of bats identified in the bat survey from the Forest of Dean.
Use dictionaries to answer these questions:
How many of each bat species were observed?
How many different bat species were observed?
Hint: Remember, each key in a dictionary is unique.
How many species in the genus Pipistrellus were observed?
Hint 1: Loop through the keys of the dictionary you constructed and count the number of keys starting with "Pipistrellus".
Hint 2: You might like to google the string method `startswith()`.
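The three questions can be approached along the lines of the sketch below. The list `bats` is a short hypothetical sample, not the Forest of Dean survey data, and all variable names are assumptions. The sketch counts observations per species, uses the number of keys as the number of species, and then counts the keys that start with the genus name.

```python
# Hypothetical observations, NOT the Forest of Dean survey data.
bats = ["Pipistrellus pipistrellus", "Myotis daubentonii",
        "Pipistrellus pygmaeus", "Pipistrellus pipistrellus",
        "Nyctalus noctula", "Myotis daubentonii",
        "Pipistrellus pipistrellus"]

# Count how many times each species appears in the list.
bat_counts = {}
for bat in bats:
    if bat in bat_counts:
        bat_counts[bat] += 1
    else:
        bat_counts[bat] = 1

# Each key is unique, so the number of keys is the number of species.
n_species = len(bat_counts)

# Count the keys (species) whose name starts with the genus name.
n_pipistrellus = 0
for species in bat_counts:
    if species.startswith("Pipistrellus"):
        n_pipistrellus += 1

print(bat_counts)
print(n_species, n_pipistrellus)
```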
Task 4.5
Protein synthesis begins at the start codon "ATG". In Task 4.4 we started translating the DNA sequence at its first base. Instead we should have started translation at the first "ATG" codon.
In Task 2.11 you wrote some code to find the index of "ATG" in a DNA sequence.
Incorporate that code into your DNA translation code to start translation at "ATG" and not before.
You should first check whether "ATG" is in the DNA sequence. If it isn't, print that the DNA sequence does not contain a start codon; otherwise, translate the sequence as normal.
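A minimal sketch of this check is shown below. It does not reproduce the code from Task 2.11 or Task 4.4: the sequence `dna` is hypothetical, `codon_table` is a deliberately partial stand-in for the full codon dictionary, and using `str.find()` to locate "ATG" is an assumption about how the index was found.

```python
# Hypothetical sequence and a deliberately partial codon table;
# real translation code would use the full 64-codon dictionary.
dna = "CCATGGCTTGGTAA"
codon_table = {"ATG": "M", "GCT": "A", "TGG": "W", "TAA": "*"}

if "ATG" in dna:
    start = dna.find("ATG")  # index of the first start codon
    protein = ""
    # Step through the sequence three bases at a time from the start codon,
    # stopping before any incomplete codon at the end.
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        # "?" marks codons that are missing from the partial table.
        protein += codon_table.get(codon, "?")
    print(protein)
else:
    print("The DNA sequence does not contain a start codon.")
```

Testing membership with `in` first keeps the "no start codon" case explicit, rather than relying on the `-1` that `find()` returns when the substring is absent.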
Task 4.6
It is useful to be able to search long DNA sequences to find palindromic sequences.
Write a program to print all palindromic sequences between 4 and 8 base pairs long (inclusive) in the following sequence.
Hint 1: This is a difficult task so you should write an algorithm on paper first before attempting to code it. By writing an algorithm first you are able to spot potential problems early instead of staring blankly at a broken piece of code.
Hint 2: You will need an outer loop with two loops nested inside it. The outermost loop goes through the different palindrome sizes (4 to 8), the first nested loop goes through the DNA sequence from left to right, and the second nested loop constructs the reverse complement of each substring.
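The loop structure described in Hint 2 could be sketched as follows. The sequence `dna` is hypothetical (not the one supplied with the task), and the names `complement`, `fragment` and `palindromes` are assumptions. Here a sequence counts as palindromic when it equals its own reverse complement.

```python
# Hypothetical sequence, NOT the one supplied with the task.
dna = "AAGAATTCGGATCCTT"
complement = {"A": "T", "T": "A", "G": "C", "C": "G"}

palindromes = []
for size in range(4, 9):                      # sizes 4 to 8 inclusive
    for start in range(len(dna) - size + 1):  # slide along the sequence
        fragment = dna[start:start + size]
        # Build the reverse complement by reading the fragment backwards.
        reverse_complement = ""
        for base in reversed(fragment):
            reverse_complement += complement[base]
        # A palindromic fragment is identical to its reverse complement.
        if fragment == reverse_complement:
            palindromes.append(fragment)
            print(size, start, fragment)
```

Only even sizes can produce a match, because the middle base of an odd-length fragment would have to be its own complement.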