Kernel: Python 2
Project 1 example
Read and evaluate the following problem statement:
Using Planet Express customer data from January 3001-3005, determine how likely previous customers are to request a repeat delivery using demographic information (profession, company size, location) and previous delivery data (days since last delivery, number of total deliveries).
1. What is the outcome?
Answer: return customer indicator (yes/no)
2. What are the predictors/covariates?
Answer: profession, company size, location, days since last delivery, number of total deliveries
3. What timeframe is this data relevant for?
Answer: Jan 3001-3005
4. What is the hypothesis?
Answer: Demographic and previous delivery info will allow us to predict if a customer will be a repeat customer
Let's begin by exploring the dataset
1. Create a data dictionary
Answer:
Variable | Description | Type of Variable |
---|---|---|
Profession | Title of the account owner | categorical |
Company Size | 1 = small, 2 = medium, 3 = large | categorical |
Location | Planet of the company | categorical |
Days Since Last Delivery | Days since the customer's last delivery (integer) | continuous |
Number of Deliveries | Total number of deliveries for the customer (integer) | continuous |
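The data dictionary above can also be kept in code, so that records can be checked against it before any analysis. The following is a minimal sketch, not part of the original notebook: the snake_case column names, the `check_row` helper, and the sample record are all hypothetical, since the actual dataset is not shown here.

```python
# Data dictionary as a plain Python structure.
# ASSUMPTION: the snake_case column names are hypothetical; the real
# dataset's column names are not shown in this notebook.
DATA_DICTIONARY = {
    "profession": {
        "description": "Title of the account owner",
        "type": "categorical",
    },
    "company_size": {
        "description": "1 = small, 2 = medium, 3 = large",
        "type": "categorical",
        "allowed": {1, 2, 3},
    },
    "location": {
        "description": "Planet of the company",
        "type": "categorical",
    },
    "days_since_last_delivery": {
        "description": "Days since last delivery (integer)",
        "type": "continuous",
    },
    "number_of_deliveries": {
        "description": "Number of total deliveries (integer)",
        "type": "continuous",
    },
}


def check_row(row, dictionary=DATA_DICTIONARY):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name, spec in dictionary.items():
        if name not in row:
            problems.append("missing column: " + name)
            continue
        value = row[name]
        # Variables marked continuous must hold numbers (booleans excluded).
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if spec["type"] == "continuous" and not is_number:
            problems.append(name + " should be numeric")
        # Coded categoricals must use one of the documented codes.
        allowed = spec.get("allowed")
        if allowed is not None and value not in allowed:
            problems.append(name + " has unexpected value: " + repr(value))
    return problems


# Made-up record, for illustration only.
sample = {
    "profession": "Professor",
    "company_size": 2,
    "location": "Mars",
    "days_since_last_delivery": 14,
    "number_of_deliveries": 7,
}

print(check_row(sample))
```

Keeping the dictionary next to the data makes it easy to catch coding mistakes early, such as a company size outside the documented 1 to 3 range.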