GitHub Repository: codebasics/deep-learning-keras-tf-tutorial
Path: blob/master/11_chrun_prediction/churn.ipynb
Kernel: Python 3

Customer Churn Prediction Using Artificial Neural Network (ANN)

Customer churn prediction means identifying which customers are likely to leave a business and why. In this tutorial we will look at customer churn in the telecom business. We will build a deep learning model to predict churn and use precision, recall, and f1-score to measure the performance of our model.

import pandas as pd
from matplotlib import pyplot as plt
import numpy as np
%matplotlib inline

Load the data

df = pd.read_csv("customer_churn.csv")
df.sample(5)

First of all, drop the customerID column as it is of no use for prediction

df.drop('customerID',axis='columns',inplace=True)
df.dtypes
gender               object
SeniorCitizen         int64
Partner              object
Dependents           object
tenure                int64
PhoneService         object
MultipleLines        object
InternetService      object
OnlineSecurity       object
OnlineBackup         object
DeviceProtection     object
TechSupport          object
StreamingTV          object
StreamingMovies      object
Contract             object
PaperlessBilling     object
PaymentMethod        object
MonthlyCharges      float64
TotalCharges         object
Churn                object
dtype: object

A quick glance at the above makes me realize that TotalCharges should be float, but it is an object. Let's check what's going on with this column

df.TotalCharges.values
array(['29.85', '1889.5', '108.15', ..., '346.45', '306.6', '6844.5'], dtype=object)

Ahh... the values are strings. Let's convert them to numbers

pd.to_numeric(df.TotalCharges)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
pandas\_libs\lib.pyx in pandas._libs.lib.maybe_convert_numeric()

ValueError: Unable to parse string " "

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-256-06ba430a4ba5> in <module>
----> 1 pd.to_numeric(df.TotalCharges)

~\AppData\Roaming\Python\Python38\site-packages\pandas\core\tools\numeric.py in to_numeric(arg, errors, downcast)
    150         coerce_numeric = errors not in ("ignore", "raise")
    151         try:
--> 152             values = lib.maybe_convert_numeric(
    153                 values, set(), coerce_numeric=coerce_numeric
    154             )

pandas\_libs\lib.pyx in pandas._libs.lib.maybe_convert_numeric()

ValueError: Unable to parse string " " at position 488

Hmmm... some values seem to be not numbers but blank strings. Let's find such rows

pd.to_numeric(df.TotalCharges,errors='coerce').isnull()
0       False
1       False
2       False
3       False
4       False
        ...
7038    False
7039    False
7040    False
7041    False
7042    False
Name: TotalCharges, Length: 7043, dtype: bool
df[pd.to_numeric(df.TotalCharges,errors='coerce').isnull()]
df.shape
(7043, 20)
df.iloc[488].TotalCharges
' '
df[df.TotalCharges!=' '].shape
(7032, 20)

Remove rows with space in TotalCharges

df1 = df[df.TotalCharges!=' ']
df1.shape
(7032, 20)
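By the way, the same cleanup can be done in one shot with errors='coerce', which turns unparseable strings (our blanks) into NaN so they can be dropped. This is just a sketch of an equivalent alternative, not what we do below:

# Alternative: coerce the blank strings to NaN, then drop those rows.
df1 = df.copy()
df1.TotalCharges = pd.to_numeric(df1.TotalCharges, errors='coerce')
df1 = df1.dropna(subset=['TotalCharges'])
df1.shape   # should also be (7032, 20)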
df1.dtypes
gender               object
SeniorCitizen         int64
Partner              object
Dependents           object
tenure                int64
PhoneService         object
MultipleLines        object
InternetService      object
OnlineSecurity       object
OnlineBackup         object
DeviceProtection     object
TechSupport          object
StreamingTV          object
StreamingMovies      object
Contract             object
PaperlessBilling     object
PaymentMethod        object
MonthlyCharges      float64
TotalCharges         object
Churn                object
dtype: object
df1.TotalCharges = pd.to_numeric(df1.TotalCharges)
C:\Users\dhava\AppData\Roaming\Python\Python38\site-packages\pandas\core\generic.py:5159: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self[name] = value
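This SettingWithCopyWarning shows up because df1 is a slice (possibly a view) of df. A small sketch of how to avoid it, assuming you are fine taking an explicit copy:

# Taking an explicit copy makes the assignment unambiguous
# and silences the SettingWithCopyWarning.
df1 = df[df.TotalCharges != ' '].copy()
df1.TotalCharges = pd.to_numeric(df1.TotalCharges)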
df1.TotalCharges.values
array([ 29.85, 1889.5 , 108.15, ..., 346.45, 306.6 , 6844.5 ])
df1[df1.Churn=='No']

Data Visualization

tenure_churn_no = df1[df1.Churn=='No'].tenure
tenure_churn_yes = df1[df1.Churn=='Yes'].tenure

plt.xlabel("tenure")
plt.ylabel("Number Of Customers")
plt.title("Customer Churn Prediction Visualization")

plt.hist([tenure_churn_yes, tenure_churn_no], rwidth=0.95, color=['green','red'], label=['Churn=Yes','Churn=No'])
plt.legend()
<matplotlib.legend.Legend at 0x2181d04b700>
[Figure: histogram of tenure for churned vs. retained customers]
mc_churn_no = df1[df1.Churn=='No'].MonthlyCharges
mc_churn_yes = df1[df1.Churn=='Yes'].MonthlyCharges

plt.xlabel("Monthly Charges")
plt.ylabel("Number Of Customers")
plt.title("Customer Churn Prediction Visualization")

plt.hist([mc_churn_yes, mc_churn_no], rwidth=0.95, color=['green','red'], label=['Churn=Yes','Churn=No'])
plt.legend()
<matplotlib.legend.Legend at 0x2181d15fac0>
[Figure: histogram of monthly charges for churned vs. retained customers]

Many of the columns contain yes/no style values. Let's print the unique values in the object columns to see what the data looks like

def print_unique_col_values(df):
    for column in df:
        if df[column].dtypes=='object':
            print(f'{column}: {df[column].unique()}')
print_unique_col_values(df1)
gender: ['Female' 'Male']
Partner: ['Yes' 'No']
Dependents: ['No' 'Yes']
PhoneService: ['No' 'Yes']
MultipleLines: ['No phone service' 'No' 'Yes']
InternetService: ['DSL' 'Fiber optic' 'No']
OnlineSecurity: ['No' 'Yes' 'No internet service']
OnlineBackup: ['Yes' 'No' 'No internet service']
DeviceProtection: ['No' 'Yes' 'No internet service']
TechSupport: ['No' 'Yes' 'No internet service']
StreamingTV: ['No' 'Yes' 'No internet service']
StreamingMovies: ['No' 'Yes' 'No internet service']
Contract: ['Month-to-month' 'One year' 'Two year']
PaperlessBilling: ['Yes' 'No']
PaymentMethod: ['Electronic check' 'Mailed check' 'Bank transfer (automatic)' 'Credit card (automatic)']
Churn: ['No' 'Yes']

Some of the columns contain 'No internet service' or 'No phone service'; those can be replaced with a simple 'No'

df1.replace('No internet service','No',inplace=True)
df1.replace('No phone service','No',inplace=True)
C:\Users\dhava\AppData\Roaming\Python\Python38\site-packages\pandas\core\frame.py:4373: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return super().replace(
print_unique_col_values(df1)
gender: ['Female' 'Male']
Partner: ['Yes' 'No']
Dependents: ['No' 'Yes']
PhoneService: ['No' 'Yes']
MultipleLines: ['No' 'Yes']
InternetService: ['DSL' 'Fiber optic' 'No']
OnlineSecurity: ['No' 'Yes']
OnlineBackup: ['Yes' 'No']
DeviceProtection: ['No' 'Yes']
TechSupport: ['No' 'Yes']
StreamingTV: ['No' 'Yes']
StreamingMovies: ['No' 'Yes']
Contract: ['Month-to-month' 'One year' 'Two year']
PaperlessBilling: ['Yes' 'No']
PaymentMethod: ['Electronic check' 'Mailed check' 'Bank transfer (automatic)' 'Credit card (automatic)']
Churn: ['No' 'Yes']

Convert Yes and No to 1 or 0

yes_no_columns = ['Partner','Dependents','PhoneService','MultipleLines','OnlineSecurity','OnlineBackup',
                  'DeviceProtection','TechSupport','StreamingTV','StreamingMovies','PaperlessBilling','Churn']
for col in yes_no_columns:
    df1[col].replace({'Yes': 1,'No': 0},inplace=True)
C:\Users\dhava\AppData\Roaming\Python\Python38\site-packages\pandas\core\series.py:4563: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return super().replace(
for col in df1:
    print(f'{col}: {df1[col].unique()}')
gender: ['Female' 'Male']
SeniorCitizen: [0 1]
Partner: [1 0]
Dependents: [0 1]
tenure: [ 1 34 2 45 8 22 10 28 62 13 16 58 49 25 69 52 71 21 12 30 47 72 17 27 5 46 11 70 63 43 15 60 18 66 9 3 31 50 64 56 7 42 35 48 29 65 38 68 32 55 37 36 41 6 4 33 67 23 57 61 14 20 53 40 59 24 44 19 54 51 26 39]
PhoneService: [0 1]
MultipleLines: [0 1]
InternetService: ['DSL' 'Fiber optic' 'No']
OnlineSecurity: [0 1]
OnlineBackup: [1 0]
DeviceProtection: [0 1]
TechSupport: [0 1]
StreamingTV: [0 1]
StreamingMovies: [0 1]
Contract: ['Month-to-month' 'One year' 'Two year']
PaperlessBilling: [1 0]
PaymentMethod: ['Electronic check' 'Mailed check' 'Bank transfer (automatic)' 'Credit card (automatic)']
MonthlyCharges: [29.85 56.95 53.85 ... 63.1 44.2 78.7 ]
TotalCharges: [ 29.85 1889.5 108.15 ... 346.45 306.6 6844.5 ]
Churn: [0 1]
df1['gender'].replace({'Female':1,'Male':0},inplace=True)
df1.gender.unique()
array([1, 0], dtype=int64)

One-hot encoding for categorical columns

df2 = pd.get_dummies(data=df1, columns=['InternetService','Contract','PaymentMethod'])
df2.columns
Index(['gender', 'SeniorCitizen', 'Partner', 'Dependents', 'tenure', 'PhoneService', 'MultipleLines', 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport', 'StreamingTV', 'StreamingMovies', 'PaperlessBilling', 'MonthlyCharges', 'TotalCharges', 'Churn', 'InternetService_DSL', 'InternetService_Fiber optic', 'InternetService_No', 'Contract_Month-to-month', 'Contract_One year', 'Contract_Two year', 'PaymentMethod_Bank transfer (automatic)', 'PaymentMethod_Credit card (automatic)', 'PaymentMethod_Electronic check', 'PaymentMethod_Mailed check'], dtype='object')
df2.sample(5)
df2.dtypes
gender                                        int64
SeniorCitizen                                 int64
Partner                                       int64
Dependents                                    int64
tenure                                        int64
PhoneService                                  int64
MultipleLines                                 int64
OnlineSecurity                                int64
OnlineBackup                                  int64
DeviceProtection                              int64
TechSupport                                   int64
StreamingTV                                   int64
StreamingMovies                               int64
PaperlessBilling                              int64
MonthlyCharges                              float64
TotalCharges                                float64
Churn                                         int64
InternetService_DSL                           uint8
InternetService_Fiber optic                   uint8
InternetService_No                            uint8
Contract_Month-to-month                       uint8
Contract_One year                             uint8
Contract_Two year                             uint8
PaymentMethod_Bank transfer (automatic)       uint8
PaymentMethod_Credit card (automatic)         uint8
PaymentMethod_Electronic check                uint8
PaymentMethod_Mailed check                    uint8
dtype: object
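If you want to avoid redundant dummy columns (the dummies for each encoded feature always sum to 1), pd.get_dummies accepts drop_first=True. This mostly matters for linear models; for our neural network the full encoding is fine. A sketch of the optional variant:

# Optional: drop the first category of each encoded column so the
# remaining dummies are not perfectly collinear.
df2 = pd.get_dummies(data=df1, columns=['InternetService','Contract','PaymentMethod'], drop_first=True)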
cols_to_scale = ['tenure','MonthlyCharges','TotalCharges']

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()

df2[cols_to_scale] = scaler.fit_transform(df2[cols_to_scale])
for col in df2:
    print(f'{col}: {df2[col].unique()}')
gender: [1 0]
SeniorCitizen: [0 1]
Partner: [1 0]
Dependents: [0 1]
tenure: [0. 0.46478873 0.01408451 0.61971831 0.09859155 0.29577465 0.12676056 0.38028169 0.85915493 0.16901408 0.21126761 0.8028169 0.67605634 0.33802817 0.95774648 0.71830986 0.98591549 0.28169014 0.15492958 0.4084507 0.64788732 1. 0.22535211 0.36619718 0.05633803 0.63380282 0.14084507 0.97183099 0.87323944 0.5915493 0.1971831 0.83098592 0.23943662 0.91549296 0.11267606 0.02816901 0.42253521 0.69014085 0.88732394 0.77464789 0.08450704 0.57746479 0.47887324 0.66197183 0.3943662 0.90140845 0.52112676 0.94366197 0.43661972 0.76056338 0.50704225 0.49295775 0.56338028 0.07042254 0.04225352 0.45070423 0.92957746 0.30985915 0.78873239 0.84507042 0.18309859 0.26760563 0.73239437 0.54929577 0.81690141 0.32394366 0.6056338 0.25352113 0.74647887 0.70422535 0.35211268 0.53521127]
PhoneService: [0 1]
MultipleLines: [0 1]
OnlineSecurity: [0 1]
OnlineBackup: [1 0]
DeviceProtection: [0 1]
TechSupport: [0 1]
StreamingTV: [0 1]
StreamingMovies: [0 1]
PaperlessBilling: [1 0]
MonthlyCharges: [0.11542289 0.38507463 0.35422886 ... 0.44626866 0.25820896 0.60149254]
TotalCharges: [0.0012751 0.21586661 0.01031041 ... 0.03780868 0.03321025 0.78764136]
Churn: [0 1]
InternetService_DSL: [1 0]
InternetService_Fiber optic: [0 1]
InternetService_No: [0 1]
Contract_Month-to-month: [1 0]
Contract_One year: [0 1]
Contract_Two year: [0 1]
PaymentMethod_Bank transfer (automatic): [0 1]
PaymentMethod_Credit card (automatic): [0 1]
PaymentMethod_Electronic check: [1 0]
PaymentMethod_Mailed check: [0 1]

Train test split

X = df2.drop('Churn',axis='columns')
y = df2['Churn']

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=5)
X_train.shape
(5625, 26)
X_test.shape
(1407, 26)
X_train[:10]
len(X_train.columns)
26
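Two optional refinements, sketched here assuming you rebuild the pipeline (they are not part of this notebook's flow): stratifying the split keeps the churn ratio identical in train and test, and fitting the scaler on the training split only avoids leaking test-set statistics into training:

# Stratified split: train and test keep the same Churn proportion.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=5, stratify=y)

# Leakage-free scaling: fit on the training split, apply to both
# (this assumes the earlier whole-dataset scaling step is skipped).
scaler = MinMaxScaler()
X_train[cols_to_scale] = scaler.fit_transform(X_train[cols_to_scale])
X_test[cols_to_scale] = scaler.transform(X_test[cols_to_scale])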

Build a model (ANN) in tensorflow/keras

import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(26, input_shape=(26,), activation='relu'),
    keras.layers.Dense(15, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])

# opt = keras.optimizers.Adam(learning_rate=0.01)

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=100)
Epoch 1/100
176/176 [==============================] - 0s 1ms/step - loss: 0.4822 - accuracy: 0.7623
Epoch 2/100
176/176 [==============================] - 0s 1ms/step - loss: 0.4269 - accuracy: 0.8000
Epoch 3/100
176/176 [==============================] - 0s 1ms/step - loss: 0.4182 - accuracy: 0.7984
...
Epoch 99/100
176/176 [==============================] - 0s 1ms/step - loss: 0.3490 - accuracy: 0.8325
Epoch 100/100
176/176 [==============================] - 0s 1ms/step - loss: 0.3486 - accuracy: 0.8368
<tensorflow.python.keras.callbacks.History at 0x21818af0f10>
model.evaluate(X_test, y_test)
44/44 [==============================] - 0s 1ms/step - loss: 0.4932 - accuracy: 0.7754
[0.4931727349758148, 0.7754086852073669]
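Training for a fixed 100 epochs leaves the stopping point to chance; note the gap between the final training accuracy (~0.84) and the test accuracy (~0.78). A common variant, sketched here as an alternative training call rather than part of this notebook, holds out a validation split and stops once validation loss stops improving:

# Early stopping: monitor held-out loss and keep the best weights seen.
early_stop = keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=5, restore_best_weights=True)
model.fit(X_train, y_train, epochs=100,
          validation_split=0.2, callbacks=[early_stop])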
yp = model.predict(X_test)
yp[:5]
array([[0.25819573], [0.4437274 ], [0.00808946], [0.7649808 ], [0.35091308]], dtype=float32)
y_pred = []
for element in yp:
    if element > 0.5:
        y_pred.append(1)
    else:
        y_pred.append(0)
y_pred[:10]
[0, 0, 0, 1, 0, 1, 0, 0, 0, 0]
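The loop above can also be written as a single vectorized threshold, which gives the same result:

# Vectorized 0.5 threshold: (n, 1) float array -> flat array of 0/1 ints.
y_pred = (yp > 0.5).astype(int).reshape(-1)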
y_test[:10]
2660    0
744     0
5579    1
64      1
3287    1
816     1
2670    0
5920    0
1023    0
6087    0
Name: Churn, dtype: int64
from sklearn.metrics import confusion_matrix, classification_report

print(classification_report(y_test, y_pred))
              precision    recall  f1-score   support

           0       0.83      0.86      0.85       999
           1       0.63      0.56      0.59       408

    accuracy                           0.78      1407
   macro avg       0.73      0.71      0.72      1407
weighted avg       0.77      0.78      0.77      1407
import seaborn as sn

cm = tf.math.confusion_matrix(labels=y_test, predictions=y_pred)

plt.figure(figsize=(10,7))
sn.heatmap(cm, annot=True, fmt='d')
plt.xlabel('Predicted')
plt.ylabel('Truth')
Text(69.0, 0.5, 'Truth')
[Figure: confusion matrix heatmap, truth vs. predicted]
y_test.shape
(1407,)
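Before doing the arithmetic by hand, it helps to unpack the four confusion-matrix cells by name. A sketch using sklearn's confusion_matrix (already imported above), whose rows are truth and columns are predictions; the counts shown are from this particular run:

# For binary labels, ravel() returns tn, fp, fn, tp in that order.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
# Here: tn = 862, fp = 137, fn = 179, tp = 229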

Accuracy

round((862+229)/(862+229+137+179),2)
0.78

Precision for the 0 class, i.e. precision for customers who did not churn

round(862/(862+179),2)
0.83

Precision for the 1 class, i.e. precision for customers who actually churned

round(229/(229+137),2)
0.63

Recall for 0 class

round(862/(862+137),2)
0.86

Recall for 1 class

round(229/(229+179),2)
0.56
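All of these hand computations can be cross-checked against sklearn's metric functions; the numbers should match the classification report above up to rounding:

from sklearn.metrics import accuracy_score, precision_score, recall_score

print(accuracy_score(y_test, y_pred))                # ~0.78
print(precision_score(y_test, y_pred, pos_label=0))  # ~0.83
print(precision_score(y_test, y_pred, pos_label=1))  # ~0.63
print(recall_score(y_test, y_pred, pos_label=0))     # ~0.86
print(recall_score(y_test, y_pred, pos_label=1))     # ~0.56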

Exercise

Take this dataset for bank customer churn prediction: https://www.kaggle.com/barelydedicated/bank-customer-churn-modeling

1) Build a deep learning model to predict the churn rate at the bank.
2) Once the model is built, print the classification report and analyze precision, recall, and f1-score.