Markdown
In Jupyter notebooks (and on GitHub) we can use Markdown syntax to make nice-looking text, include links and images, render code and equations, and organize our presentations.
When you make a new cell in Jupyter by either:
Pressing the hotkey combo "shift-enter" to execute the current cell (a new cell is added below when you run the last one in the notebook)
Selecting "Insert Cell" from the Insert menu
You'll notice that the default cell content-type is "code". To make a cell into a markdown cell, either:
Select "Markdown" from the drop-down menu above, or
Use the hotkey "Ctrl-m" and then press "m"
First exercise
Make this cell into a markdown cell
Run the cell using shift-enter
Hotkeys
There are a ton of useful hotkeys in Jupyter. Pressing "Ctrl-m" (or Esc) puts you into command mode. Pressing another key then usually does something useful. For example,
"Ctrl-m" then "a" inserts a new cell above this one
"Ctrl-m" then "b" inserts a new cell below
You can find a list of hotkeys here.
Exercise: Practice by making at least one cell above this one and one cell below.
Rendering Code and Equations in Markdown
Enclosing text in backticks will render the text as code. For example, `y = a * x + b` renders code inline, and three backticks render code as a block:
If you include the language, Jupyter will color the code nicely:
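As a rough illustration, a block fenced with three backticks and tagged with the language name python might look like the one below. The function `line` and its default values are made up for this example and simply reuse the inline expression from above.

```python
def line(x, a=2.0, b=1.0):
    """Evaluate y = a * x + b for a single value of x."""
    return a * x + b


# Call the function once so the cell shows some output when run as code
print(line(3))
```

Inside a Markdown cell the block is only displayed, not executed. Paste it into a code cell if you want to run it.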
If you happen to know about LaTeX, you can also render math equations using two dollar signs $$
like so:
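As a small illustration (the equation is an arbitrary choice and not necessarily the one from the original cell), typing the following into a Markdown cell should render as a displayed equation:

```latex
$$ y = a x + b $$
```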
If the cell already rendered the equation, double-click somewhere in the cell to see the syntax.
Making Links
In Markdown we can link to other sites like so: [link-name](link-URL), for example: Google. Double-click on the cell to see the syntax.
Exercise:
Insert a new cell below this one
Make a link to General Assembly's webpage
Make a link to our GitHub repository
Embedding Images
We can also embed images. This is a famous visualization of Napoleon's failed invasion of Russia.
In a new cell, insert an image of your favorite sports team's logo, your favorite animal, or
More Markdown
As data scientists we're often expected to read documentation and apply
Here's a good list of markdown commands. Take a look and then complete the following exercises.
Make three different level headers
Make an ordered list of the data science workflow (seven items!)
Embed the data science workflow image in a new cell (you can find the image on our GitHub page)
Use markdown to make one word in the following sentence bold, one italic, and one strike-through:
The quick brown fox jumps over the lazy dog.
My awesome list
First item
Second item
sub item
The quick brown fox jumps over the lazy dog.
Python Practice
Now that we've learned a bit about markdown, try to get into the habit of using markdown to explain and break up your code. This is a good habit because it:
Keeps your code organized
Makes it easier to follow your logic when you present or share
Makes it easier for you to recall your intentions later
Now let's switch gears and practice using Python.
Anything you can do in Python you can do in a Jupyter notebook. That includes importing libraries, defining functions, and making plots. Work through the following exercises, and feel free to search for help.
Exercise 1
Print your name using the print command
Exercise 2
Create a list of the first 10 integers using range
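One thing to watch for here: `range` stops just before its end value, so "the first 10 integers" can mean two different lists depending on where you start counting. A minimal sketch of both readings (the variable names are only illustrative):

```python
# range(stop) starts at 0 and excludes the stop value
first_ten_from_zero = list(range(10))

# range(start, stop) lets you start at 1 instead
first_ten_from_one = list(range(1, 11))

print(first_ten_from_zero)
print(first_ten_from_one)
```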
Exercise 3
Write a function that takes one variable n and sums the squares of the positive integers less than or equal to n. There are a couple of ways to do this: list comprehensions, a loop, and probably others.
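If you get stuck, here is one possible sketch of the two approaches mentioned above. The name `sum_squares` comes from the later exercises, while `sum_squares_comprehension` is just a label chosen for this example. Try it yourself first!

```python
def sum_squares(n):
    """Sum k**2 for k = 1, ..., n using an explicit loop."""
    total = 0
    for k in range(1, n + 1):
        total += k ** 2
    return total


def sum_squares_comprehension(n):
    """Same sum, built from a list comprehension."""
    return sum([k ** 2 for k in range(1, n + 1)])


print(sum_squares(3), sum_squares_comprehension(3))
```

Both versions return 0 when n is less than 1, because the range is then empty.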
Exercise 4
Use your function to find the sum of the first 5 squares.
Exercise 5
It turns out that there is a nice formula for the sum of squares: $$\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$$
Write a new function sum_squares2 that uses this formula. If you used this method above for sum_squares, write the function using a loop instead.
Exercise 6
Compute the sum of the first 20 squares. Do your functions agree?
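One way to check your work is sketched below, assuming the formula meant in Exercise 5 is the standard closed form n(n+1)(2n+1)/6. The bodies of `sum_squares` and `sum_squares2` here are only one possible implementation, so yours may look different.

```python
def sum_squares(n):
    """Brute-force sum of k**2 for k = 1, ..., n."""
    return sum(k ** 2 for k in range(1, n + 1))


def sum_squares2(n):
    """Closed form n(n+1)(2n+1)/6 (assumed to be the formula in Exercise 5)."""
    # The product is always divisible by 6, so integer division is exact
    return n * (n + 1) * (2 * n + 1) // 6


# Compare the two functions on the first 20 squares
brute_force = sum_squares(20)
closed_form = sum_squares2(20)
print(brute_force, closed_form, brute_force == closed_form)
```

Using `//` keeps the result an integer. A plain `/` would return a float in Python 3.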
Practical Exercises
Hopefully so far so good! Let's learn about some useful libraries now.
Downloading data
We can even download data inside a notebook. This is really useful when you want to scrape websites for data, and for a lot of other purposes like connecting to APIs. Let's try this out using the requests package. If for some reason this isn't installed, you can install it with conda using the graphical interface or from the command line with conda install requests.
We'll download the famous Boston Housing dataset from the UCI machine learning repository.
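A minimal sketch of such a download is shown below. The URL in `HOUSING_URL` is an assumption about where the raw file lives (the repository may have moved it), and the helper name `download` is made up for this example.

```python
from pathlib import Path

# Assumed location of the raw file; check the UCI repository if this fails
HOUSING_URL = (
    "https://archive.ics.uci.edu/ml/machine-learning-databases/"
    "housing/housing.data"
)


def download(url, filename, timeout=30):
    """Fetch url with requests and save the response body to filename."""
    import requests  # imported here so defining the function never fails

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on HTTP errors such as 404
    path = Path(filename)
    path.write_bytes(response.content)
    return path
```

Calling `download(HOUSING_URL, "housing.data")` should then leave a file named housing.data next to your notebook, provided the URL is still valid.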
Exercise: Try downloading another dataset from the UCI repository (any one you like is fine).
Now let's open the file with pandas. We'll spend a lot of time getting comfortable with pandas next week. For now let's just load the data into a data frame.
The data doesn't include the column names, so we'll have to add those manually. (Data science isn't always glamorous!) You can find an explanation of the data at the link above.
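Here is a rough sketch of loading a whitespace-separated file without a header row and attaching the column names by hand. The 14 names follow the usual description of the Boston Housing data, the helper `load_housing` is made up for this example, and the two sample rows are only there so the sketch runs without the downloaded file.

```python
import io

import pandas as pd

# Column names as listed in the usual description of the dataset
COLUMN_NAMES = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT", "MEDV"]


def load_housing(source):
    """Read a whitespace-separated housing file into a DataFrame."""
    return pd.read_csv(source, sep=r"\s+", header=None, names=COLUMN_NAMES)


# Two illustrative rows laid out like the raw file (stand-in for the real data)
sample = io.StringIO(
    " 0.00632  18.00   2.310  0  0.5380  6.5750  65.20  4.0900   1  296.0  15.30 396.90   4.98  24.00\n"
    " 0.02731   0.00   7.070  0  0.4690  6.4210  78.90  4.9671   2  242.0  17.80 396.90   9.14  21.60\n"
)
data_frame = load_housing(sample)
print(data_frame.head())
```

To load the real data, pass the path of the file you downloaded instead of `sample`.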
Take a look at the data
Next Steps
If you've made it this far you are doing great! Get a jump start on the next class by poking around with pandas:
Compute the mean and standard deviation of some of the columns. You can get the data like so: data_frame["CRIM"]
Try to make a histogram of some of the columns.
Compute the correlation matrix with data_frame.corr(). Notice anything interesting?
Try to make a scatter plot of some of the strongly correlated columns.
Take a look at the pandas visualization documentation for some ideas on how to make plots.
Try to sketch out a plan of how to use the data science workflow on this dataset. Start by formulating a problem, such as predicting property tax (TAX) from the other variables. What steps have you already completed in the DSW?
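To get started on the pandas steps in the list above, here is a small sketch. It uses a tiny made-up data frame in place of the housing data so that it runs on its own, and the 0.8 cutoff for "strongly correlated" is an arbitrary choice.

```python
import pandas as pd

# Made-up values standing in for three of the housing columns
data_frame = pd.DataFrame({
    "CRIM": [0.1, 0.2, 0.4, 0.8],
    "TAX": [200.0, 250.0, 300.0, 400.0],
    "RM": [7.0, 6.5, 6.0, 5.0],
})

# Mean and standard deviation of a single column
crim = data_frame["CRIM"]
print(crim.mean(), crim.std())

# Pairwise correlations between all numeric columns
corr = data_frame.corr()
print(corr)

# List each pair of distinct columns whose correlation is strong
strong = [
    (a, b, corr.loc[a, b])
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if abs(corr.loc[a, b]) > 0.8
]
print(strong)
```

For the plots, `data_frame["CRIM"].hist()` and `data_frame.plot.scatter(x="RM", y="TAX")` are the pandas calls to try. Both rely on matplotlib being installed.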
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-42-085e2dd1e3fc> in <module>()
----> 1 my_dict[0]
KeyError: 0