Kernel: Python 3 (system-wide)

Strings I - working with text

In biology, many types of data are predominantly text, for example, DNA and protein sequences, survey replies, species names or gene identifiers.

Python refers to text as strings, as in a string of characters. A string can be any length and can contain any letters, numbers and punctuation.

We've already used a string: "Hello, world!", in Notebook 2.

A string must start and end with either single quotes or double quotes, as long as both quotes are the same. So both

'Hello, world!'
"Hello, world!"

are identical strings.
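As a quick check, the sketch below compares the two forms. The variable names single_quoted and double_quoted are made up for this illustration.

```python
# Two strings written with different quote styles (illustrative variable names).
single_quoted = 'Hello, world!'
double_quoted = "Hello, world!"

# The == operator compares the characters, not the quotes used to write them.
print( single_quoted == double_quoted )
```

This prints True, because the quotes only mark where the string starts and ends and are not part of the string itself.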

Length of a string

The number of characters in a string is called its length. This includes all letters and any spaces and punctuation.

The function len() returns the length of a string as demonstrated in the following code.

# Get the length of a sentence and assign it to the variable called sentence_length.
sentence_length = len( 'Hello, world!' )
# Use an f-string to output the length of the sentence.
print( f'The length of the sentence is {sentence_length} characters' )
The length of the sentence is 13 characters
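The same function works on any text, including the kinds of biological data mentioned at the start. The following is a small sketch; the DNA sequence and the variable names are invented for illustration.

```python
# A made-up DNA sequence and a species name (illustrative values only).
dna = 'ATGCGTAA'
species = 'Homo sapiens'

# len() counts every character, so the space in the species name is included.
dna_length = len( dna )
species_length = len( species )

print( f'The DNA sequence has {dna_length} bases' )
print( f'The species name has {species_length} characters' )
```

The species name has 12 characters rather than 11 because the space between the two words counts towards the length.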

String variables

Just like numbers, we can assign strings to variables so that we can store them in memory, recall them and modify them.

The code below shows how to assign a string to a variable called sentence. As with a number variable, we can print out the value and type of a string variable. A string variable has type str.

# Assign the string 'Hello, world!' to the string variable called sentence.
sentence = 'Hello, world!'
print( sentence )         # Print the value of sentence.
print( len(sentence) )    # Print the length of the sentence.
print( type(sentence) )   # Print the type of sentence.
print( 'sentence' )       # 'sentence' in quotes is a string not a variable.
Hello, world!
13
<class 'str'>
sentence

Notice that when you print a string the quotes are not included.

Also notice that the variable sentence is not in quotes inside the print() function:

print( sentence )

This is because sentence is a string variable and not a string. In other words, sentence is a variable with value 'Hello, world!' and type str, whereas 'sentence' in quotes is just a string containing the word sentence.

The empty string

The empty string is simply a pair of quotes, either single or double, with nothing in between them.

# Assign the empty string to a variable called name.
name = ''
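To see that the empty string really is a string with nothing in it, the sketch below checks its length and type using the functions introduced earlier.

```python
# Assign the empty string to a variable called name.
name = ''

# The empty string contains no characters, so its length is zero.
print( len(name) )

# It is still a string, so its type is str.
print( type(name) )
```

This prints 0 followed by <class 'str'>.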

Adding to the end of a string: Concatenation

Why is an empty string useful? Sometimes we want to build up a string one letter or word at a time.

There are two ways of doing this: with the + operator or the += operator. Both are shown in the following code.

# Assign the empty string to a variable called name.
name = ''
# Add the string 'Harry' to the value of name and reassign to name.
name = name + 'Harry'
print(name)
# Add the string ' Potter' to the end of name.
name += ' Potter'
print(name)
Harry
Harry Potter

Notice the explicit space character in ' Potter' so that the two names are separated by a space.
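Building a string one letter at a time usually happens inside a loop. The following is a minimal sketch that assumes you are comfortable with lists and for loops, which this notebook does not cover; the names bases and sequence are made up for illustration.

```python
# A list of single-letter strings (illustrative data).
bases = ['A', 'T', 'G', 'C']

# Start from the empty string so there is something to add to.
sequence = ''

# Add each base to the end of sequence, one at a time.
for base in bases:
    sequence += base

print( sequence )
```

After the loop the value of sequence is 'ATGC'. Starting from the empty string means the first += has a string to add to without contributing any characters of its own.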

Input

To ask the user for input we use the input() function.

To see how this works run the following code.

Python asks for input with the prompt "Enter a name:". It assigns your input to the string variable called name which it then prints in an f-string.

# Ask for input with the prompt "Enter a name" and assign the input to the string variable called name.
name = input('Enter a name: ')
print( f'Hello {name}' )
Enter a name:
Hello artur

Casting: Converting a string to a number

Say you wanted to input two numbers and print their product.

1. Look at the following code and see whether you think it will work or not.

2. Run it and enter two numbers when prompted; any numbers will do.

# Ask for the first number and assign to the variable called number1.
number1 = input('Enter a number: ')
# Ask for the second number and assign to the variable called number2.
number2 = input('Enter another number: ')
# Print the product of the two numbers.
print( number1 * number2 )

The code produced an error, in this case a TypeError. A TypeError means you are trying to do something with the wrong type of variable. Remember that, as well as a name and a value, a variable also has a type.

So far we've seen integer (int), decimal (float) and string (str) variables.

The code above doesn't work because the input() function always returns a string. Even though you entered a number, it is treated as a string, just as "3.1" is a string because it is enclosed by double quotes.

This means the line

print( number1 * number2 )

is trying to multiply two strings together rather than two numbers. And that we cannot do.
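The sketch below reproduces the problem without needing any typing. It uses the fixed strings '3' and '4' as stand-ins for what input() would return, and a try/except block, which this notebook does not cover, so that the error is reported rather than stopping the program. The variable error_caught is made up for this illustration.

```python
# Stand-ins for the values input() would return: strings, not numbers.
number1 = '3'
number2 = '4'

# Both variables have type str even though they look like numbers.
print( type(number1) )
print( type(number2) )

# Multiplying two strings raises a TypeError, which we catch and report.
error_caught = False
try:
    print( number1 * number2 )
except TypeError as error:
    error_caught = True
    print( f'TypeError: {error}' )
```

Both types are printed as <class 'str'>, and the multiplication ends in the except branch.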

To fix this we need to convert the strings assigned to number1 and number2 to floats. This is called casting: the conversion of one data type to another. To do this we use the float() function.

Run the following code to see how it is done.

# Ask for the first number.
number1 = input('Enter a number: ')
# Ask for the second number.
number2 = input('Enter another number: ')
# Print the product of the two numbers but cast them to floats first.
print( float(number1) * float(number2) )
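Casting can also be tried without input() by starting from fixed strings. The sketch below does this, and also shows the int() function, which converts a string to a whole number in the same way that float() converts it to a decimal. The values and the names product and whole_number are chosen here for illustration.

```python
# Two strings that look like numbers (illustrative values).
number1 = '3.1'
number2 = '2'

# Cast both strings to floats before multiplying.
product = float(number1) * float(number2)
print( product )
print( type(product) )

# int() casts a string containing a whole number to an integer.
whole_number = int(number2)
print( whole_number )
print( type(whole_number) )
```

The product is a float and whole_number is an int, even though both started out as strings.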

Newlines and tabs: Escape characters

Sometimes text includes invisible characters that indicate how the text should be formatted, such as tabs or newline characters. Since it would be difficult to see these characters in our code, we use escape characters in Python strings instead. Escape characters are characters prefixed with a backslash. For example, tabs and newline characters are represented by the following escape characters:

format type    escape character
new line       \n
tab            \t

# Separate forename and surname with tabs and rows by newlines.
print( 'Forename\tSurname\n-----------------------\nHermione\tGranger\nRon\t\tWeasley' )
Forename        Surname
-----------------------
Hermione        Granger
Ron             Weasley
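Although an escape character is typed as two symbols, a backslash and a letter, Python stores it as a single character. The sketch below checks this with len(); the string assigned to row is made up for illustration.

```python
# One row of the table: a forename, a tab, a surname and a newline.
row = 'Ron\tWeasley\n'

# Each escape character counts as one character:
# 3 letters + 1 tab + 7 letters + 1 newline = 12.
row_length = len( row )
print( row_length )

# On their own, the tab and newline each have length 1.
print( len('\t') )
print( len('\n') )
```

This prints 12, then 1, then 1.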
