US - Baby Names
Introduction:
We are going to use a subset of the US Baby Names dataset from Kaggle. The file contains names from 2004 through 2014.
Step 1. Import the necessary libraries
In [ ]:
Step 2. Import the dataset from this address.
Step 3. Assign it to a variable called baby_names.
In [ ]:
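A minimal sketch of Steps 1 to 3, assuming the library in use is pandas. The dataset address is not reproduced here, so an in-memory CSV stands in for it. The column names `Name`, `Year`, `Gender`, `State` and `Count`, and all of the values, are assumptions made for illustration; only `Unnamed: 0` and `Id` are named in the exercise.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real file: three made-up rows.
# The blank first header cell is what pandas turns into 'Unnamed: 0'.
csv_text = """,Id,Name,Year,Gender,State,Count
0,1,Emma,2004,F,AK,62
1,2,Madison,2004,F,AK,48
2,3,Jacob,2004,M,AK,55
"""

# pd.read_csv accepts a URL, a local path, or a file-like object,
# so the real address could be passed in place of the StringIO buffer.
baby_names = pd.read_csv(io.StringIO(csv_text))
print(baby_names.columns.tolist())
```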
Step 4. Display the first 10 entries
In [ ]:
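One possible approach for Step 4 is `DataFrame.head`, shown here on a made-up 15-row frame (the column names and values are assumptions, not the real data).

```python
import pandas as pd

# Hypothetical toy frame with 15 rows.
baby_names = pd.DataFrame({
    "Name": [f"name_{i}" for i in range(15)],
    "Count": list(range(15)),
})

# head(n) returns the first n rows; the default is 5.
first_ten = baby_names.head(10)
print(first_ten)
```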
Step 5. Delete the columns 'Unnamed: 0' and 'Id'
In [ ]:
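A sketch of Step 5 using `DataFrame.drop` with the `columns` keyword. The toy rows and the `Name` and `Count` columns are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy frame that includes the two columns to remove.
baby_names = pd.DataFrame({
    "Unnamed: 0": [0, 1],
    "Id": [1, 2],
    "Name": ["Emma", "Jacob"],
    "Count": [62, 55],
})

# drop returns a new frame, so the result is assigned back.
baby_names = baby_names.drop(columns=["Unnamed: 0", "Id"])
print(baby_names.columns.tolist())
```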
Step 6. Are there more male or female names in the dataset?
In [ ]:
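Step 6 can be read two ways: counting rows per gender, or summing the number of babies per gender. The sketch below shows both, assuming hypothetical `Gender` and `Count` columns and made-up values.

```python
import pandas as pd

# Hypothetical toy frame; column names and values are assumptions.
baby_names = pd.DataFrame({
    "Name": ["Emma", "Madison", "Jacob"],
    "Gender": ["F", "F", "M"],
    "Count": [62, 48, 55],
})

# Reading 1: how many rows (name entries) each gender has.
rows_per_gender = baby_names["Gender"].value_counts()

# Reading 2: how many babies each gender accounts for.
births_per_gender = baby_names.groupby("Gender")["Count"].sum()

print(rows_per_gender)
print(births_per_gender)
```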
Step 7. Group the dataset by name and assign the result to names
In [ ]:
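A sketch of Step 7, assuming the grouped result should hold the total `Count` per name. Selecting `[["Count"]]` before summing is a choice made here so that text columns such as the hypothetical `State` are not aggregated.

```python
import pandas as pd

# Hypothetical toy frame: 'Emma' appears in two states.
baby_names = pd.DataFrame({
    "Name": ["Emma", "Emma", "Jacob"],
    "State": ["AK", "AL", "AK"],
    "Count": [62, 30, 55],
})

# One row per name, holding the summed Count.
names = baby_names.groupby("Name")[["Count"]].sum()
print(names)
```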
Step 8. How many different names exist in the dataset?
In [ ]:
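For Step 8, `Series.nunique` counts distinct values. A small sketch on made-up data, with `Name` as an assumed column name:

```python
import pandas as pd

# Hypothetical toy frame with one repeated name.
baby_names = pd.DataFrame({
    "Name": ["Emma", "Emma", "Jacob", "Mia"],
    "Count": [62, 30, 55, 12],
})

# Number of distinct names, ignoring repeats.
n_names = baby_names["Name"].nunique()
print(n_names)
```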
Step 9. Which name has the most occurrences?
In [ ]:
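One way to approach Step 9 is `Series.idxmax` on the per-name totals, which returns the index label (here the name) of the largest value. The data and column names below are assumptions.

```python
import pandas as pd

# Hypothetical toy frame.
baby_names = pd.DataFrame({
    "Name": ["Emma", "Emma", "Jacob"],
    "Count": [62, 30, 55],
})

# Total occurrences per name, then the label of the maximum.
names = baby_names.groupby("Name")[["Count"]].sum()
top_name = names["Count"].idxmax()
print(top_name, names.loc[top_name, "Count"])
```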
Step 10. How many different names share the lowest number of occurrences?
In [ ]:
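A sketch of Step 10: find the smallest per-name total, then count how many names match it. The toy values are chosen so that two names tie for the minimum; all names and numbers are assumptions.

```python
import pandas as pd

# Hypothetical toy frame: 'Jacob' and 'Mia' tie for the lowest total.
baby_names = pd.DataFrame({
    "Name": ["Emma", "Emma", "Jacob", "Mia"],
    "Count": [62, 30, 5, 5],
})

names = baby_names.groupby("Name")[["Count"]].sum()

# Boolean mask of names whose total equals the minimum, then count the Trues.
least = names["Count"].min()
n_least = int((names["Count"] == least).sum())
print(least, n_least)
```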
Step 11. What is the median name occurrence?
In [ ]:
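For Step 11, `Series.median` gives the median of the per-name totals. The sketch below also filters for names sitting exactly at the median, which is one possible reading of the question; the data is made up.

```python
import pandas as pd

# Hypothetical toy frame with totals of 92, 10 and 5.
baby_names = pd.DataFrame({
    "Name": ["Emma", "Emma", "Jacob", "Mia"],
    "Count": [62, 30, 10, 5],
})

names = baby_names.groupby("Name")[["Count"]].sum()

# Median of the totals, and the names whose total equals it.
median_count = names["Count"].median()
at_median = names[names["Count"] == median_count]
print(median_count)
print(at_median)
```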
Step 12. What is the standard deviation of name occurrences?
In [ ]:
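A sketch of Step 12 with `Series.std`. Note that pandas computes the sample standard deviation by default (`ddof=1`); passing `ddof=0` gives the population version. Which one the exercise intends is not stated, so both are shown on made-up totals.

```python
import pandas as pd

# Hypothetical toy frame with per-name totals of 2, 4 and 6.
baby_names = pd.DataFrame({
    "Name": ["Emma", "Jacob", "Mia"],
    "Count": [2, 4, 6],
})

names = baby_names.groupby("Name")[["Count"]].sum()

sample_std = names["Count"].std()            # ddof=1 (pandas default)
population_std = names["Count"].std(ddof=0)  # divides by n instead of n - 1
print(sample_std, population_std)
```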
Step 13. Get a summary with the mean, min, max, std and quartiles.
In [ ]:
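For Step 13, `DataFrame.describe` returns the count, mean, std, min, quartiles and max in one table. A small sketch on made-up totals (names and values are assumptions):

```python
import pandas as pd

# Hypothetical toy frame with per-name totals of 2, 4 and 6.
baby_names = pd.DataFrame({
    "Name": ["Emma", "Jacob", "Mia"],
    "Count": [2, 4, 6],
})

names = baby_names.groupby("Name")[["Count"]].sum()

# Rows of the summary: count, mean, std, min, 25%, 50%, 75%, max.
summary = names.describe()
print(summary)
```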