Logistic regression exercise with Titanic data
Introduction
Data from Kaggle's Titanic competition: data, data dictionary
Goal: Predict survival based on passenger characteristics
titanic.csv is already in our repo, so there is no need to download the data from the Kaggle website.
Step 1: Read the data into Pandas
In [ ]:
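One possible way to approach this step is sketched below. The location of titanic.csv inside the repo is not stated here, so the sketch parses a few made-up rows from an in-memory string instead; `sample_csv` and its contents are placeholders, and only the column names are taken from the Kaggle data dictionary. Using PassengerId as the index is an optional choice, not a requirement of the exercise.

```python
import io

import pandas as pd

# Made-up rows standing in for titanic.csv (NOT real passenger records).
sample_csv = io.StringIO(
    "PassengerId,Survived,Pclass,Parch\n"
    "1,0,3,0\n"
    "2,1,1,0\n"
    "3,1,3,2\n"
)

# With the real file, pass its path instead of sample_csv.
# index_col makes PassengerId the row index rather than a feature column.
titanic = pd.read_csv(sample_csv, index_col="PassengerId")
print(titanic.head())
```

With the real data, replacing `sample_csv` with the path to titanic.csv should be the only change needed.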
Step 2: Create X and y
Define Pclass and Parch as the features, and Survived as the response.
In [ ]:
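A minimal sketch of selecting the features and the response, assuming the data is already in a DataFrame named `titanic`. The rows below are invented so that the block runs on its own; `feature_cols` is just an illustrative variable name.

```python
import pandas as pd

# Made-up stand-in for the titanic DataFrame (NOT real passenger records).
titanic = pd.DataFrame({
    "Survived": [0, 1, 1, 0],
    "Pclass": [3, 1, 3, 2],
    "Parch": [0, 0, 2, 1],
})

feature_cols = ["Pclass", "Parch"]
X = titanic[feature_cols]  # 2-D feature matrix (DataFrame)
y = titanic["Survived"]    # 1-D response vector (Series)

print(X.shape, y.shape)
```

Selecting with a list of column names keeps `X` two-dimensional, which is the shape scikit-learn estimators expect for features.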
Step 3: Split the data into training and testing sets
In [ ]:
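The split could be done with scikit-learn's `train_test_split`, as in the sketch below. The data is invented, and `random_state=1` is an arbitrary choice made here only so the split is reproducible. The default settings hold out 25% of the rows for testing.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up stand-in data (NOT real passenger records).
titanic = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0, 0, 1],
    "Pclass": [3, 1, 3, 2, 1, 3, 2, 1],
    "Parch": [0, 0, 2, 1, 0, 0, 1, 2],
})
X = titanic[["Pclass", "Parch"]]
y = titanic["Survived"]

# Default test_size is 0.25; random_state fixes the shuffle.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

print(X_train.shape, X_test.shape)
```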
Step 4: Fit a logistic regression model and examine the coefficients
Confirm that the coefficients make intuitive sense.
In [ ]:
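A sketch of fitting the model and pairing each coefficient with its feature name. The training rows are invented, and `LogisticRegression()` is used with its default settings, which include L2 regularization; the exercise does not say which settings to use. On the real data, a negative Pclass coefficient would mean that a higher class number (a lower-class ticket) is associated with lower odds of survival.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Made-up stand-in for the training set (NOT real passenger records).
train = pd.DataFrame({
    "Survived": [1, 1, 1, 0, 1, 0, 0, 0, 1, 0],
    "Pclass": [1, 1, 1, 1, 2, 2, 3, 3, 3, 3],
    "Parch": [0, 1, 0, 2, 0, 1, 0, 0, 1, 0],
})
feature_cols = ["Pclass", "Parch"]
X_train = train[feature_cols]
y_train = train["Survived"]

logreg = LogisticRegression()
logreg.fit(X_train, y_train)

# coef_ has shape (1, n_features) for a binary response.
coefs = dict(zip(feature_cols, logreg.coef_[0]))
print(coefs)
print(logreg.intercept_)
```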
Step 5: Make predictions on the testing set and calculate the accuracy
In [ ]:
In [ ]:
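Class predictions and accuracy could be computed as sketched below. The block repeats the earlier steps on invented data so that it runs on its own; `y_pred_class` is an illustrative name, and `random_state=1` is an arbitrary choice.

```python
import pandas as pd
from sklearn import metrics
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Made-up stand-in data (NOT real passenger records).
titanic = pd.DataFrame({
    "Survived": [1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0],
    "Pclass": [1, 1, 1, 1, 2, 2, 3, 3, 3, 3, 2, 3],
    "Parch": [0, 1, 0, 2, 0, 1, 0, 0, 1, 0, 2, 0],
})
X = titanic[["Pclass", "Parch"]]
y = titanic["Survived"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

logreg = LogisticRegression()
logreg.fit(X_train, y_train)

# predict() returns the predicted class label (0 or 1) for each test row.
y_pred_class = logreg.predict(X_test)

# Accuracy is the fraction of test rows whose prediction matches the truth.
accuracy = metrics.accuracy_score(y_test, y_pred_class)
print(accuracy)
```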
Step 6: Compare your testing accuracy to the null accuracy
In [ ]:
In [ ]:
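Null accuracy is the accuracy obtained by always predicting the most frequent class in the testing set. A minimal sketch with invented labels follows; `y_test` here is a hand-written Series, not the real testing response.

```python
import pandas as pd

# Made-up testing labels (NOT real passenger records).
y_test = pd.Series([0, 0, 1, 0, 1, 0, 0, 1])

# Proportion of each class; the largest one is the null accuracy.
class_share = y_test.value_counts(normalize=True)
null_accuracy = class_share.max()

# For a 0/1 response this equals max(mean, 1 - mean).
assert null_accuracy == max(y_test.mean(), 1 - y_test.mean())
print(null_accuracy)  # 0.625 (5 of the 8 labels are 0)
```

A model is only adding value if its testing accuracy is higher than this baseline.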
Confusion matrix of Titanic predictions
In [ ]:
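A sketch of building and reading a confusion matrix with scikit-learn. The true and predicted labels are invented so the counts can be checked by hand; with the exercise data they would be `y_test` and the predicted classes from Step 5. Sensitivity and specificity are included as examples of metrics derived from the matrix.

```python
from sklearn import metrics

# Made-up labels (NOT real results): 1 = survived, 0 = did not survive.
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 0, 1, 0]

# Rows are true classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
confusion = metrics.confusion_matrix(y_true, y_pred)
print(confusion)

# ravel() flattens the 2x2 matrix in row order.
tn, fp, fn, tp = confusion.ravel()

sensitivity = tp / (tp + fn)  # share of actual survivors predicted correctly
specificity = tn / (tn + fp)  # share of actual non-survivors predicted correctly
print(sensitivity, specificity)
```

Note that the first argument to `confusion_matrix` is the true labels and the second is the predictions; swapping them transposes the matrix.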