GitHub Repository: codebasics/deep-learning-keras-tf-tutorial
Path: blob/master/11_chrun_prediction/churn.ipynb
Kernel: Python 3

Customer Churn Prediction Using Artificial Neural Network (ANN)

Customer churn prediction means identifying which customers are likely to leave a business and why. In this tutorial we will look at customer churn in the telecom business. We will build a deep learning model to predict churn and use precision, recall, and f1-score to measure the performance of our model.

import pandas as pd
from matplotlib import pyplot as plt
import numpy as np
%matplotlib inline

Load the data

df = pd.read_csv("customer_churn.csv")
df.sample(5)

First of all, drop the customerID column as it is of no use for prediction

df.drop('customerID',axis='columns',inplace=True)
df.dtypes
gender               object
SeniorCitizen         int64
Partner              object
Dependents           object
tenure                int64
PhoneService         object
MultipleLines        object
InternetService      object
OnlineSecurity       object
OnlineBackup         object
DeviceProtection     object
TechSupport          object
StreamingTV          object
StreamingMovies      object
Contract             object
PaperlessBilling     object
PaymentMethod        object
MonthlyCharges      float64
TotalCharges         object
Churn                object
dtype: object

A quick glance at the above makes me realize that TotalCharges should be float, but it is an object. Let's check what's going on with this column

df.TotalCharges.values
array(['29.85', '1889.5', '108.15', ..., '346.45', '306.6', '6844.5'], dtype=object)

Ahh... the values are strings. Let's convert them to numbers

pd.to_numeric(df.TotalCharges)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
pandas\_libs\lib.pyx in pandas._libs.lib.maybe_convert_numeric()

ValueError: Unable to parse string " "

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-256-06ba430a4ba5> in <module>
----> 1 pd.to_numeric(df.TotalCharges)

~\AppData\Roaming\Python\Python38\site-packages\pandas\core\tools\numeric.py in to_numeric(arg, errors, downcast)
    150         coerce_numeric = errors not in ("ignore", "raise")
    151         try:
--> 152             values = lib.maybe_convert_numeric(
    153                 values, set(), coerce_numeric=coerce_numeric
    154             )

pandas\_libs\lib.pyx in pandas._libs.lib.maybe_convert_numeric()

ValueError: Unable to parse string " " at position 488

Hmmm... some values seem to be not numbers but blank strings. Let's find such rows

pd.to_numeric(df.TotalCharges,errors='coerce').isnull()
0       False
1       False
2       False
3       False
4       False
        ...
7038    False
7039    False
7040    False
7041    False
7042    False
Name: TotalCharges, Length: 7043, dtype: bool
df[pd.to_numeric(df.TotalCharges,errors='coerce').isnull()]
df.shape
(7043, 20)
df.iloc[488].TotalCharges
' '
df[df.TotalCharges!=' '].shape
(7032, 20)

Remove rows with space in TotalCharges

df1 = df[df.TotalCharges!=' ']
df1.shape
(7032, 20)
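By the way, the same cleanup can be done in one shot with errors='coerce', which turns unparseable strings (our blanks) into NaN so they can be dropped. This is just a sketch of an equivalent alternative, not what we do below:

# Alternative: coerce the blank strings to NaN, then drop those rows.
df1 = df.copy()
df1.TotalCharges = pd.to_numeric(df1.TotalCharges, errors='coerce')
df1 = df1.dropna(subset=['TotalCharges'])
df1.shape   # should also be (7032, 20)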
df1.dtypes
gender               object
SeniorCitizen         int64
Partner              object
Dependents           object
tenure                int64
PhoneService         object
MultipleLines        object
InternetService      object
OnlineSecurity       object
OnlineBackup         object
DeviceProtection     object
TechSupport          object
StreamingTV          object
StreamingMovies      object
Contract             object
PaperlessBilling     object
PaymentMethod        object
MonthlyCharges      float64
TotalCharges         object
Churn                object
dtype: object
df1.TotalCharges = pd.to_numeric(df1.TotalCharges)
C:\Users\dhava\AppData\Roaming\Python\Python38\site-packages\pandas\core\generic.py:5159: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self[name] = value
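This SettingWithCopyWarning shows up because df1 is a slice (possibly a view) of df. A small sketch of how to avoid it, assuming you are fine taking an explicit copy:

# Taking an explicit copy makes the assignment unambiguous
# and silences the SettingWithCopyWarning.
df1 = df[df.TotalCharges != ' '].copy()
df1.TotalCharges = pd.to_numeric(df1.TotalCharges)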
df1.TotalCharges.values
array([ 29.85, 1889.5 , 108.15, ..., 346.45, 306.6 , 6844.5 ])
df1[df1.Churn=='No']

Data Visualization

tenure_churn_no = df1[df1.Churn=='No'].tenure
tenure_churn_yes = df1[df1.Churn=='Yes'].tenure

plt.xlabel("tenure")
plt.ylabel("Number Of Customers")
plt.title("Customer Churn Prediction Visualization")

plt.hist([tenure_churn_yes, tenure_churn_no], rwidth=0.95, color=['green','red'], label=['Churn=Yes','Churn=No'])
plt.legend()
<matplotlib.legend.Legend at 0x2181d04b700>
[Figure: histogram of tenure for churned vs. retained customers]
mc_churn_no = df1[df1.Churn=='No'].MonthlyCharges
mc_churn_yes = df1[df1.Churn=='Yes'].MonthlyCharges

plt.xlabel("Monthly Charges")
plt.ylabel("Number Of Customers")
plt.title("Customer Churn Prediction Visualization")

plt.hist([mc_churn_yes, mc_churn_no], rwidth=0.95, color=['green','red'], label=['Churn=Yes','Churn=No'])
plt.legend()
<matplotlib.legend.Legend at 0x2181d15fac0>
[Figure: histogram of monthly charges for churned vs. retained customers]

Many of the columns contain yes/no style values. Let's print the unique values in the object columns to see what the data looks like

def print_unique_col_values(df):
    for column in df:
        if df[column].dtypes=='object':
            print(f'{column}: {df[column].unique()}')
print_unique_col_values(df1)
gender: ['Female' 'Male']
Partner: ['Yes' 'No']
Dependents: ['No' 'Yes']
PhoneService: ['No' 'Yes']
MultipleLines: ['No phone service' 'No' 'Yes']
InternetService: ['DSL' 'Fiber optic' 'No']
OnlineSecurity: ['No' 'Yes' 'No internet service']
OnlineBackup: ['Yes' 'No' 'No internet service']
DeviceProtection: ['No' 'Yes' 'No internet service']
TechSupport: ['No' 'Yes' 'No internet service']
StreamingTV: ['No' 'Yes' 'No internet service']
StreamingMovies: ['No' 'Yes' 'No internet service']
Contract: ['Month-to-month' 'One year' 'Two year']
PaperlessBilling: ['Yes' 'No']
PaymentMethod: ['Electronic check' 'Mailed check' 'Bank transfer (automatic)' 'Credit card (automatic)']
Churn: ['No' 'Yes']

Some of the columns contain 'No internet service' or 'No phone service'; those can be replaced with a simple 'No'

df1.replace('No internet service','No',inplace=True)
df1.replace('No phone service','No',inplace=True)
C:\Users\dhava\AppData\Roaming\Python\Python38\site-packages\pandas\core\frame.py:4373: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return super().replace(
print_unique_col_values(df1)
gender: ['Female' 'Male']
Partner: ['Yes' 'No']
Dependents: ['No' 'Yes']
PhoneService: ['No' 'Yes']
MultipleLines: ['No' 'Yes']
InternetService: ['DSL' 'Fiber optic' 'No']
OnlineSecurity: ['No' 'Yes']
OnlineBackup: ['Yes' 'No']
DeviceProtection: ['No' 'Yes']
TechSupport: ['No' 'Yes']
StreamingTV: ['No' 'Yes']
StreamingMovies: ['No' 'Yes']
Contract: ['Month-to-month' 'One year' 'Two year']
PaperlessBilling: ['Yes' 'No']
PaymentMethod: ['Electronic check' 'Mailed check' 'Bank transfer (automatic)' 'Credit card (automatic)']
Churn: ['No' 'Yes']

Convert Yes and No to 1 or 0

yes_no_columns = ['Partner','Dependents','PhoneService','MultipleLines','OnlineSecurity','OnlineBackup',
                  'DeviceProtection','TechSupport','StreamingTV','StreamingMovies','PaperlessBilling','Churn']
for col in yes_no_columns:
    df1[col].replace({'Yes': 1,'No': 0},inplace=True)
C:\Users\dhava\AppData\Roaming\Python\Python38\site-packages\pandas\core\series.py:4563: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return super().replace(
for col in df1:
    print(f'{col}: {df1[col].unique()}')
gender: ['Female' 'Male']
SeniorCitizen: [0 1]
Partner: [1 0]
Dependents: [0 1]
tenure: [ 1 34 2 45 8 22 10 28 62 13 16 58 49 25 69 52 71 21 12 30 47 72 17 27 5 46 11 70 63 43 15 60 18 66 9 3 31 50 64 56 7 42 35 48 29 65 38 68 32 55 37 36 41 6 4 33 67 23 57 61 14 20 53 40 59 24 44 19 54 51 26 39]
PhoneService: [0 1]
MultipleLines: [0 1]
InternetService: ['DSL' 'Fiber optic' 'No']
OnlineSecurity: [0 1]
OnlineBackup: [1 0]
DeviceProtection: [0 1]
TechSupport: [0 1]
StreamingTV: [0 1]
StreamingMovies: [0 1]
Contract: ['Month-to-month' 'One year' 'Two year']
PaperlessBilling: [1 0]
PaymentMethod: ['Electronic check' 'Mailed check' 'Bank transfer (automatic)' 'Credit card (automatic)']
MonthlyCharges: [29.85 56.95 53.85 ... 63.1 44.2 78.7 ]
TotalCharges: [ 29.85 1889.5 108.15 ... 346.45 306.6 6844.5 ]
Churn: [0 1]
df1['gender'].replace({'Female':1,'Male':0},inplace=True)
df1.gender.unique()
array([1, 0], dtype=int64)

One-hot encoding for categorical columns

df2 = pd.get_dummies(data=df1, columns=['InternetService','Contract','PaymentMethod'])
df2.columns
Index(['gender', 'SeniorCitizen', 'Partner', 'Dependents', 'tenure', 'PhoneService', 'MultipleLines', 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport', 'StreamingTV', 'StreamingMovies', 'PaperlessBilling', 'MonthlyCharges', 'TotalCharges', 'Churn', 'InternetService_DSL', 'InternetService_Fiber optic', 'InternetService_No', 'Contract_Month-to-month', 'Contract_One year', 'Contract_Two year', 'PaymentMethod_Bank transfer (automatic)', 'PaymentMethod_Credit card (automatic)', 'PaymentMethod_Electronic check', 'PaymentMethod_Mailed check'], dtype='object')
df2.sample(5)
df2.dtypes
gender                                        int64
SeniorCitizen                                 int64
Partner                                       int64
Dependents                                    int64
tenure                                        int64
PhoneService                                  int64
MultipleLines                                 int64
OnlineSecurity                                int64
OnlineBackup                                  int64
DeviceProtection                              int64
TechSupport                                   int64
StreamingTV                                   int64
StreamingMovies                               int64
PaperlessBilling                              int64
MonthlyCharges                              float64
TotalCharges                                float64
Churn                                         int64
InternetService_DSL                           uint8
InternetService_Fiber optic                   uint8
InternetService_No                            uint8
Contract_Month-to-month                       uint8
Contract_One year                             uint8
Contract_Two year                             uint8
PaymentMethod_Bank transfer (automatic)       uint8
PaymentMethod_Credit card (automatic)         uint8
PaymentMethod_Electronic check                uint8
PaymentMethod_Mailed check                    uint8
dtype: object
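If you want to avoid redundant dummy columns (the dummies for each encoded feature always sum to 1), pd.get_dummies accepts drop_first=True. This mostly matters for linear models; for our neural network the full encoding is fine. A sketch of the optional variant:

# Optional: drop the first category of each encoded column so the
# remaining dummies are not perfectly collinear.
df2 = pd.get_dummies(data=df1, columns=['InternetService','Contract','PaymentMethod'], drop_first=True)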
cols_to_scale = ['tenure','MonthlyCharges','TotalCharges']

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()

df2[cols_to_scale] = scaler.fit_transform(df2[cols_to_scale])
for col in df2:
    print(f'{col}: {df2[col].unique()}')
gender: [1 0]
SeniorCitizen: [0 1]
Partner: [1 0]
Dependents: [0 1]
tenure: [0. 0.46478873 0.01408451 0.61971831 0.09859155 0.29577465 0.12676056 0.38028169 0.85915493 0.16901408 0.21126761 0.8028169 0.67605634 0.33802817 0.95774648 0.71830986 0.98591549 0.28169014 0.15492958 0.4084507 0.64788732 1. 0.22535211 0.36619718 0.05633803 0.63380282 0.14084507 0.97183099 0.87323944 0.5915493 0.1971831 0.83098592 0.23943662 0.91549296 0.11267606 0.02816901 0.42253521 0.69014085 0.88732394 0.77464789 0.08450704 0.57746479 0.47887324 0.66197183 0.3943662 0.90140845 0.52112676 0.94366197 0.43661972 0.76056338 0.50704225 0.49295775 0.56338028 0.07042254 0.04225352 0.45070423 0.92957746 0.30985915 0.78873239 0.84507042 0.18309859 0.26760563 0.73239437 0.54929577 0.81690141 0.32394366 0.6056338 0.25352113 0.74647887 0.70422535 0.35211268 0.53521127]
PhoneService: [0 1]
MultipleLines: [0 1]
OnlineSecurity: [0 1]
OnlineBackup: [1 0]
DeviceProtection: [0 1]
TechSupport: [0 1]
StreamingTV: [0 1]
StreamingMovies: [0 1]
PaperlessBilling: [1 0]
MonthlyCharges: [0.11542289 0.38507463 0.35422886 ... 0.44626866 0.25820896 0.60149254]
TotalCharges: [0.0012751 0.21586661 0.01031041 ... 0.03780868 0.03321025 0.78764136]
Churn: [0 1]
InternetService_DSL: [1 0]
InternetService_Fiber optic: [0 1]
InternetService_No: [0 1]
Contract_Month-to-month: [1 0]
Contract_One year: [0 1]
Contract_Two year: [0 1]
PaymentMethod_Bank transfer (automatic): [0 1]
PaymentMethod_Credit card (automatic): [0 1]
PaymentMethod_Electronic check: [1 0]
PaymentMethod_Mailed check: [0 1]

Train test split

X = df2.drop('Churn',axis='columns')
y = df2['Churn']

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=5)
X_train.shape
(5625, 26)
X_test.shape
(1407, 26)
X_train[:10]
len(X_train.columns)
26
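Two optional refinements, sketched here assuming you rebuild the pipeline (they are not part of this notebook's flow): stratifying the split keeps the churn ratio identical in train and test, and fitting the scaler on the training split only avoids leaking test-set statistics into training:

# Stratified split: train and test keep the same Churn proportion.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=5, stratify=y)

# Leakage-free scaling: fit on the training split, apply to both
# (this assumes the earlier whole-dataset scaling step is skipped).
scaler = MinMaxScaler()
X_train[cols_to_scale] = scaler.fit_transform(X_train[cols_to_scale])
X_test[cols_to_scale] = scaler.transform(X_test[cols_to_scale])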

Build a model (ANN) in tensorflow/keras

import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(26, input_shape=(26,), activation='relu'),
    keras.layers.Dense(15, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])

# opt = keras.optimizers.Adam(learning_rate=0.01)

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=100)
Epoch 1/100
176/176 [==============================] - 0s 1ms/step - loss: 0.4822 - accuracy: 0.7623
Epoch 2/100
176/176 [==============================] - 0s 1ms/step - loss: 0.4269 - accuracy: 0.8000
Epoch 3/100
176/176 [==============================] - 0s 1ms/step - loss: 0.4182 - accuracy: 0.7984
...
Epoch 99/100
176/176 [==============================] - 0s 1ms/step - loss: 0.3490 - accuracy: 0.8325
Epoch 100/100
176/176 [==============================] - 0s 1ms/step - loss: 0.3486 - accuracy: 0.8368
<tensorflow.python.keras.callbacks.History at 0x21818af0f10>
model.evaluate(X_test, y_test)
44/44 [==============================] - 0s 1ms/step - loss: 0.4932 - accuracy: 0.7754
[0.4931727349758148, 0.7754086852073669]
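Training for a fixed 100 epochs leaves the stopping point to chance; note the gap between the final training accuracy (~0.84) and the test accuracy (~0.78). A common variant, sketched here as an alternative training call rather than part of this notebook, holds out a validation split and stops once validation loss stops improving:

# Early stopping: monitor held-out loss and keep the best weights seen.
early_stop = keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=5, restore_best_weights=True)
model.fit(X_train, y_train, epochs=100,
          validation_split=0.2, callbacks=[early_stop])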
yp = model.predict(X_test)
yp[:5]
array([[0.25819573], [0.4437274 ], [0.00808946], [0.7649808 ], [0.35091308]], dtype=float32)
y_pred = []
for element in yp:
    if element > 0.5:
        y_pred.append(1)
    else:
        y_pred.append(0)
y_pred[:10]
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0]
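The loop above can also be written as a single vectorized threshold, which gives the same result:

# Vectorized 0.5 threshold: (n, 1) float array -> flat array of 0/1 ints.
y_pred = (yp > 0.5).astype(int).reshape(-1)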
y_test[:10]
2660    0
744     0
5579    1
64      1
3287    1
816     1
2670    0
5920    0
1023    0
6087    0
Name: Churn, dtype: int64
from sklearn.metrics import confusion_matrix, classification_report

print(classification_report(y_test, y_pred))
              precision    recall  f1-score   support

           0       0.83      0.86      0.85       999
           1       0.63      0.56      0.59       408

    accuracy                           0.78      1407
   macro avg       0.73      0.71      0.72      1407
weighted avg       0.77      0.78      0.77      1407
import seaborn as sn

cm = tf.math.confusion_matrix(labels=y_test, predictions=y_pred)

plt.figure(figsize=(10,7))
sn.heatmap(cm, annot=True, fmt='d')
plt.xlabel('Predicted')
plt.ylabel('Truth')
Text(69.0, 0.5, 'Truth')
[Figure: confusion matrix heatmap, truth vs. predicted]
y_test.shape
(1407,)
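Before doing the arithmetic by hand, it helps to unpack the four confusion-matrix cells by name. A sketch using sklearn's confusion_matrix (already imported above), whose rows are truth and columns are predictions; the counts shown are from this particular run:

# For binary labels, ravel() returns tn, fp, fn, tp in that order.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
# Here: tn = 862, fp = 137, fn = 179, tp = 229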

Accuracy

round((862+229)/(862+229+137+179),2)
0.78

Precision for the 0 class, i.e. precision for customers who did not churn

round(862/(862+179),2)
0.83

Precision for the 1 class, i.e. precision for customers who actually churned

round(229/(229+137),2)
0.63

Recall for 0 class

round(862/(862+137),2)
0.86

Recall for 1 class

round(229/(229+179),2)
0.56
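All of these hand computations can be cross-checked against sklearn's metric functions; the numbers should match the classification report above up to rounding:

from sklearn.metrics import accuracy_score, precision_score, recall_score

print(accuracy_score(y_test, y_pred))                # ~0.78
print(precision_score(y_test, y_pred, pos_label=0))  # ~0.83
print(precision_score(y_test, y_pred, pos_label=1))  # ~0.63
print(recall_score(y_test, y_pred, pos_label=0))     # ~0.86
print(recall_score(y_test, y_pred, pos_label=1))     # ~0.56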

Exercise

Take this dataset for bank customer churn prediction: https://www.kaggle.com/barelydedicated/bank-customer-churn-modeling

1) Build a deep learning model to predict the churn rate at the bank.
2) Once the model is built, print the classification report and analyze precision, recall, and f1-score.