Dictionaries IV - lookup table

A very common use of a Python dictionary is as a lookup table. Lookup tables are used to convert information from one form into another.

In this Notebook we will look at a simple example of finding the complement of a DNA sequence. In the workshop you will look at a more complicated example: translating DNA into protein.

DNA is double-stranded. The two strands are complementary: an A base on one strand pairs with a T base on the other strand, and G pairs with C. So given one strand, say "AACCGGTT", we know that its complement will be "TTGGCCAA".

Given a particular DNA sequence we can construct its complement using a lookup table like so:

| base (key) | complement base (value) |
|------------|-------------------------|
| A          | T                       |
| C          | G                       |
| G          | C                       |
| T          | A                       |
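
As a minimal sketch, the table above can be written directly as a Python dictionary, with each base as a key and its complement as the value. The final loop is only an illustrative sanity check: complementing a base twice should give back the original base.

```python
# The lookup table above written as a Python dictionary.
complement_table = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}

# Look up a single base: the key goes in square brackets.
print(complement_table['A'])  # T
print(complement_table['G'])  # C

# Sanity check: complementing a base twice returns the original base.
for base in complement_table:
    assert complement_table[complement_table[base]] == base
```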

Given a string containing a DNA sequence, we can construct its entire complement sequence using the above lookup table. These are the steps involved:

  1. Create a dictionary lookup table of each base (key) and its complement (value).

  2. Initialise an empty string to store the complement sequence.

  3. Loop through the DNA sequence one base at a time.

  4. Look up the complement of the current base in the dictionary lookup table and add it to the end of the complement sequence.

The following code implements this algorithm. Read it first to understand what it does, then run it.

```python
# A lookup table of complementary DNA bases.
complement_table = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}

# An example DNA sequence.
dna_seq = 'TTTATGTATCCTTATATCACAACTCGAAGATTCTTCTTCTGCACGAGAAGCGTGGGAATCATGGAATAA'

# Assign an empty string to complement_seq. We will construct this sequence
# one base at a time as we loop through the DNA sequence.
complement_seq = ''

# Loop through the DNA sequence. base is the iterating variable.
for base in dna_seq:
    # Look up the complement base of the current base in the lookup table.
    complement_base = complement_table[base]
    # Add the complement base to the end of the complement sequence.
    complement_seq += complement_base

# Print the DNA sequence and its complement for comparison.
print(dna_seq)
print(complement_seq)
```
TTTATGTATCCTTATATCACAACTCGAAGATTCTTCTTCTGCACGAGAAGCGTGGGAATCATGGAATAA
AAATACATAGGAATATAGTGTTGAGCTTCTAAGAAGAAGACGTGCTCTTCGCACCCTTAGTACCTTATT
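
One way to reuse the same lookup is to wrap it in a function. The sketch below is not part of the original Notebook: the names `COMPLEMENT_TABLE` and `complement` are made up for illustration, and it uses `''.join(...)` in place of repeated `+=`. It is checked against the short example from the introduction.

```python
# Lookup table of complementary DNA bases (same contents as the table above).
COMPLEMENT_TABLE = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}


def complement(dna_seq):
    """Return the complement of dna_seq, looked up one base at a time."""
    # Look up each base in turn, then join the results into one string.
    return ''.join(COMPLEMENT_TABLE[base] for base in dna_seq)


print(complement('AACCGGTT'))  # TTGGCCAA
```

The dictionary lookup is the same as in the loop version; only the way the result string is assembled differs.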

The key line is:

complement_base = complement_table[base]

This is where we convert the current base in the DNA sequence to its complement using the lookup table.
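
Square-bracket lookup raises a `KeyError` if the key is not in the dictionary, which would happen here for any character other than A, C, G or T. The sketch below shows two possible ways of handling that. The helper name `lookup_complement`, the character `'N'` and the placeholder `'?'` are assumptions chosen for illustration only.

```python
complement_table = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}


def lookup_complement(base):
    """Hypothetical helper: look up one base, failing clearly on unknown characters."""
    if base not in complement_table:
        raise ValueError(f"Unexpected base {base!r}; expected one of A, C, G, T")
    return complement_table[base]


# A known base is converted as before.
print(lookup_complement('A'))          # T

# dict.get returns a default value instead of raising an error.
print(complement_table.get('N', '?'))  # ?
```

Dictionary keys are case-sensitive, so a lowercase `'a'` would also count as an unknown character with this table.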
