DSCI 100 - Introduction to Data Science
Lecture 4 - Data visualization in R
2019-01-23
Housekeeping
Grades are coming! Thanks for your patience!!!
Quiz next week!
45 min
open book (but not collaborative!)
in class, but on Canvas
you will get some practice quiz questions by the end of the week
The basic ggplot call:
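The slide's own code is not reproduced here, so the following is only a minimal sketch of what a basic call can look like. It assumes the built-in `mtcars` data set as a stand-in; the axis labels are illustrative choices, not taken from the lecture.

```r
library(ggplot2)

# mtcars ships with R; it is used here only as a stand-in data set (assumption).
basic_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +  # data + aesthetic mapping
    geom_point() +                                     # geometric object: one point per row
    xlab("Weight (1000 lbs)") +                        # human-readable axis labels
    ylab("Fuel efficiency (miles per gallon)")

basic_plot  # printing the object draws the plot
```

The three pieces to notice are the data frame, the `aes()` mapping from columns to axes, and the `geom_*` layer that decides how the data is drawn.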
Where to get help (and ideas) for creating ggplot2 visualizations?
Only make the plot area as big as needed!
the default size is ridiculous!
use the repr package to set your plot size with R in Jupyter
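A minimal sketch of setting the plot size, assuming a Jupyter notebook running the R kernel. The width and height values are arbitrary examples; pick whatever fits your plot.

```r
library(repr)  # the Jupyter R kernel uses repr to render plots

# Width and height are given in inches; 5 x 4 is just an example size.
options(repr.plot.width = 5, repr.plot.height = 4)
```

Plots drawn in later cells use this size until the options are changed again.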
Don’t adjust the axes to zoom in on small differences (if the difference is small, show that it’s small!)
Show the data (don’t hide the shape/distribution of the data behind a bar)
next two slides borrowed from Jeff Leek
Be wary of overplotting...
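One common way to reduce overplotting is to make the points partly transparent. The sketch below assumes the `diamonds` data set that ships with ggplot2; the `alpha` value is an illustrative guess and usually needs tuning for each data set.

```r
library(ggplot2)

# diamonds has roughly 54,000 rows, so opaque points pile up on each other.
overplot_fix <- ggplot(diamonds, aes(x = carat, y = price)) +
    geom_point(alpha = 0.1)  # alpha near 0 = nearly transparent, 1 = fully opaque

overplot_fix
```

With transparency, dense regions appear darker, so the shape of the data stays visible.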
Use colors sparingly
Use legends and labels so that your visualization is understandable without reading the surrounding text
Ensure the text on your visualization is big enough to be easily read
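A minimal sketch of the two guidelines above, again assuming `mtcars` as a stand-in data set. The label wording and the font size of 18 are illustrative choices, not values from the lecture.

```r
library(ggplot2)

labelled_plot <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
    geom_point() +
    labs(x = "Weight (1000 lbs)",                     # axis labels a reader can understand
         y = "Fuel efficiency (miles per gallon)",
         colour = "Cylinders") +                      # legend title
    theme(text = element_text(size = 18))             # larger text for all plot elements

labelled_plot
```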
Do not use pie charts!
Do not use 3D!
Attribution
images without code mostly come from Fundamentals of Data Visualization by Claus O. Wilke
Go and create!
Make an effective plot!
Can petal length and petal width be used to separate the Iris flower species? Create a plot to answer this question!
What did we learn today?
Combine color and shape for points to separate groups in scatter plots
Rarely use 3D or pie charts; there are other, more effective ways of communicating the information
Some rules of thumb/guidelines for making visualizations
How to negatively filter using !
How to use options from the repr package to set plot size (and that we should set plot size)!
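As a recap, here is a small sketch that combines two of the points above: a negative filter with `!`, and color plus shape to separate groups in a scatter plot. It assumes dplyr's `filter()` and uses `mtcars` as a stand-in data set; the variable names are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Negative filter: keep only the rows where the condition is NOT true.
not_eight_cyl <- filter(mtcars, !(cyl == 8))

# Map the same grouping variable to both colour and shape.
group_plot <- ggplot(not_eight_cyl,
                     aes(x = wt, y = mpg,
                         colour = factor(cyl), shape = factor(cyl))) +
    geom_point(size = 3) +
    labs(x = "Weight (1000 lbs)",
         y = "Fuel efficiency (miles per gallon)",
         colour = "Cylinders", shape = "Cylinders")  # same title merges the two legends

group_plot
```

Using both color and shape keeps the groups distinguishable even when the plot is printed in grayscale or viewed by someone with a color vision deficiency.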