Path: blob/master/april_18/projects/unit-projects/project-2/starter-code/project2-starter.ipynb
Kernel: Python 2
Project 2
In this project, you will implement the exploratory analysis plan developed in Project 1. This will lay the groundwork for our first modeling exercise in Project 3.
Step 1: Load the Python libraries you will need for this project
In [1]:
Step 2: Read in your data set
In [2]:
Out[2]:
admit gre gpa prestige
0 0 380 3.61 3
1 1 660 3.67 3
2 1 800 4.00 1
3 1 640 3.19 4
4 0 520 2.93 4
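The starter cells for Steps 1 and 2 are empty, so here is a minimal sketch of what loading and previewing the data might look like. The file name is not given in this notebook, so the sketch reads from an in-memory string holding only the five rows shown in the preview; with the real data you would pass your own file path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Stand-in for the real CSV file (assumption: the data is comma-separated).
# Only the five rows visible in the preview above are reproduced here.
csv_text = u"""admit,gre,gpa,prestige
0,380,3.61,3
1,660,3.67,3
1,800,4.00,1
1,640,3.19,4
0,520,2.93,4
"""

# With the real data: df = pd.read_csv("<path to your CSV file>")
df = pd.read_csv(io.StringIO(csv_text))

# head() returns the first five rows by default.
print(df.head())
```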
Questions
Question 1. How many observations are in our dataset?
In [3]:
Out[3]:
admit 400
gre 398
gpa 398
prestige 399
dtype: int64
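The output above has the shape of a `DataFrame.count()` result, which reports non-missing values per column rather than the number of rows. The sketch below shows the difference on a toy frame. It reuses the five preview rows with two values blanked out purely for illustration; the blanks are hypothetical and do not reflect where the real gaps are.

```python
import numpy as np
import pandas as pd

# Toy data: the five preview rows, with two values replaced by NaN
# (hypothetical gaps, for illustration only).
df = pd.DataFrame({
    "admit": [0, 1, 1, 1, 0],
    "gre": [380, 660, 800, np.nan, 520],
    "gpa": [3.61, 3.67, 4.00, 3.19, np.nan],
    "prestige": [3, 3, 1, 4, 4],
})

n_rows = len(df)                  # total rows, missing values included
non_null = df.count()             # non-missing values in each column
n_complete = len(df.dropna())     # rows with no missing value in any column

print(n_rows)
print(non_null)
print(n_complete)
```

When the per-column counts differ, as they do in `Out[3]`, it is worth stating which of these three numbers you mean by "observations".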
Answer:
Question 2. Create a summary table
In [ ]:
In [ ]:
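One common way to build a summary table in pandas is `DataFrame.describe()`. The sketch below runs it on the five preview rows only, so the numbers it prints are not those of the full dataset.

```python
import pandas as pd

# Toy data: the five rows shown in the preview, not the full dataset.
df = pd.DataFrame({
    "admit": [0, 1, 1, 1, 0],
    "gre": [380, 660, 800, 640, 520],
    "gpa": [3.61, 3.67, 4.00, 3.19, 2.93],
    "prestige": [3, 3, 1, 4, 4],
})

# describe() reports count, mean, std, min, quartiles and max
# for every numeric column.
summary = df.describe()
print(summary)
```

Each statistic, including `std`, is expressed in the units of its own column, which is worth keeping in mind when comparing columns measured on different scales.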
Question 3. Why would GRE have a larger STD than GPA?
Answer:
Question 4. Drop data points with missing data
In [ ]:
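A minimal sketch of dropping incomplete rows with `DataFrame.dropna()`, followed by a check that nothing missing remains. The toy frame reuses the preview rows with two hypothetical gaps inserted for illustration.

```python
import numpy as np
import pandas as pd

# Toy data: the five preview rows with two hypothetical missing values.
df = pd.DataFrame({
    "admit": [0, 1, 1, 1, 0],
    "gre": [380, 660, 800, np.nan, 520],
    "gpa": [3.61, 3.67, 4.00, 3.19, np.nan],
    "prestige": [3, 3, 1, 4, 4],
})

# dropna() with default arguments removes every row that has
# at least one missing value; the original frame is left unchanged.
clean = df.dropna()

# Checks: no missing values remain, and every column now has the same count.
missing_after = clean.isnull().sum()
print(missing_after)
print(clean.count())
```

Comparing `len(df)` with `len(clean)` also shows how many rows were removed.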
Question 5. Confirm that you dropped the correct data. How can you tell?
Answer:
Question 6. Create box plots for GRE and GPA
In [ ]:
In [ ]:
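A minimal sketch of drawing the two box plots with matplotlib, using the five preview rows as stand-in data. GRE and GPA are drawn on separate axes because their scales differ by two orders of magnitude. The `Agg` backend line is only there so the sketch runs without a display; inside a notebook you would typically rely on inline plotting instead.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Toy data: the five rows shown in the preview, not the full dataset.
df = pd.DataFrame({
    "gre": [380, 660, 800, 640, 520],
    "gpa": [3.61, 3.67, 4.00, 3.19, 2.93],
})

fig, axes = plt.subplots(1, 2, figsize=(8, 4))

# Drop missing values first: matplotlib's boxplot does not skip NaN.
axes[0].boxplot(df["gre"].dropna())
axes[0].set_title("GRE")

axes[1].boxplot(df["gpa"].dropna())
axes[1].set_title("GPA")

fig.tight_layout()
```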
Question 7. What do these plots show?
Answer:
Question 8. Describe each distribution
In [ ]:
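One way to put numbers on the shape of each distribution is to compute sample skewness and excess kurtosis with pandas. This is a sketch on the five preview rows, so the values it prints say nothing about the full dataset; histograms are a useful complement.

```python
import pandas as pd

# Toy data: the five rows shown in the preview, not the full dataset.
df = pd.DataFrame({
    "gre": [380, 660, 800, 640, 520],
    "gpa": [3.61, 3.67, 4.00, 3.19, 2.93],
})

# skew(): 0 for a symmetric distribution, negative for a longer left tail,
# positive for a longer right tail.
# kurt(): excess kurtosis, where a normal distribution scores 0.
shape = pd.DataFrame({
    "skew": df.skew(),
    "kurtosis": df.kurt(),
})
print(shape)
```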
Question 9. If our model had an assumption of a normal distribution would we meet that requirement?
Answer:
Question 10. Does this distribution need correction? If so, why? How?
Answer:
Question 11. Which of our variables are potentially collinear?
In [ ]:
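A pairwise correlation matrix is a common first check for collinearity. The sketch below computes Pearson correlations with `DataFrame.corr()` on the five preview rows; with so few rows the coefficients are illustrative only.

```python
import pandas as pd

# Toy data: the five rows shown in the preview, not the full dataset.
df = pd.DataFrame({
    "admit": [0, 1, 1, 1, 0],
    "gre": [380, 660, 800, 640, 520],
    "gpa": [3.61, 3.67, 4.00, 3.19, 2.93],
    "prestige": [3, 3, 1, 4, 4],
})

# corr() computes pairwise Pearson correlations by default.
# Off-diagonal values close to +1 or -1 flag potentially collinear pairs.
corr = df.corr()
print(corr)
```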
Question 12. What did you find?
Answer:
Question 13. Write an analysis plan for exploring the association between grad school admissions rates and prestige of undergraduate schools.
Answer:
Question 14. What is your hypothesis?
Answer: