Path: blob/master/ML Classification using Python/Logistic regression for Customer churn Prediction.ipynb
Customer churn prediction
Customer churn prediction is a data-driven approach to identifying which customers are likely to stop using a product or service within a given time frame. It is widely used in industries such as telecom, banking, SaaS, and retail, because retaining existing customers is usually more cost-effective than acquiring new ones.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[12], line 4
2 plt.figure(figsize=(12, 8))
3 for i, col in enumerate(X[:4], 1):
----> 4 plt.subplot(2, 2, i)
5 customer_churn_data[col].hist(bins=30)
6 plt.title(col)
, in subplot(*args, **kwargs)
1320 fig = gcf()
1322 # First, search for an existing subplot with a matching spec.
-> 1323 key = SubplotSpec._from_subplot_args(fig, args)
1325 for ax in fig.axes:
1326 # if we found an Axes at the position sort out if we can re-use it
1327 if ax.get_subplotspec() == key:
1328 # if the user passed no kwargs, re-use
, in SubplotSpec._from_subplot_args(figure, args)
598 else:
599 if not isinstance(num, Integral) or num < 1 or num > rows*cols:
--> 600 raise ValueError(
601 f"num must be an integer with 1 <= num <= {rows*cols}, "
602 f"not {num!r}"
603 )
604 i = j = num
605 return gs[i-1:j]
ValueError: num must be an integer with 1 <= num <= 4, not 5
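The error happens because iterating over a pandas DataFrame yields its column names, and `X[:4]` slices the first four rows, not columns, so the loop still enumerates every column and `i` eventually exceeds the 2×2 grid. A minimal sketch of the fix, using `X.columns[:4]` so only four column names are enumerated (the column names and data below are stand-ins, not the notebook's actual dataset):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in for customer_churn_data; the real notebook's columns may differ.
rng = np.random.default_rng(0)
customer_churn_data = pd.DataFrame({
    "tenure": rng.integers(0, 72, 200),
    "MonthlyCharges": rng.uniform(20, 120, 200),
    "TotalCharges": rng.uniform(20, 8000, 200),
    "SeniorCitizen": rng.integers(0, 2, 200),
    "Churn": rng.integers(0, 2, 200),
})
X = customer_churn_data.drop(columns="Churn")

plt.figure(figsize=(12, 8))
# X.columns[:4] limits the loop to four column NAMES; X[:4] slices rows,
# and iterating that DataFrame would still yield every column name.
for i, col in enumerate(X.columns[:4], 1):
    plt.subplot(2, 2, i)  # i stays within 1..4, so no ValueError
    customer_churn_data[col].hist(bins=30)
    plt.title(col)
plt.tight_layout()
```

The same guard works for any grid size: make the slice length match `rows * cols` in `plt.subplot(rows, cols, i)`.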
The following is a hyperparameter tuning grid for Logistic Regression in scikit-learn, typically used with GridSearchCV or RandomizedSearchCV.
Explanation of Each Parameter
clf__penalty: ['l2']
Penalty refers to the type of regularization applied to prevent overfitting. 'l2' is Ridge regularization, the most common choice for Logistic Regression. Other options (not in this grid) are 'l1' (Lasso) and 'elasticnet' (a mix of both).
clf__C: [0.01, 0.1, 1, 10, 100]
C is the inverse of regularization strength. A smaller C means stronger regularization (a simpler model); a larger C means weaker regularization (the model fits the data more closely). This grid tests a wide range, from very strong regularization (0.01) to almost none (100).
clf__solver: ['lbfgs', 'liblinear']
Solver is the algorithm used to optimize the logistic regression cost function. 'lbfgs' handles larger datasets well and supports the L2 penalty; 'liblinear' works well for small datasets and supports both L1 and L2 penalties.
Why tune these?
Different penalties and solvers affect convergence and performance.
C controls bias-variance tradeoff.
Choosing the right combination improves accuracy and generalization.
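To make the points above concrete, here is a minimal end-to-end sketch of tuning this grid with GridSearchCV. The synthetic dataset and the step names "scaler" and "clf" are assumptions standing in for the notebook's churn data and pipeline:

```python
# Sketch only: make_classification stands in for the churn dataset, and the
# Pipeline step names ("scaler", "clf") are assumed, not the notebook's code.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=400, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

pipe = Pipeline([
    ("scaler", StandardScaler()),                 # scaling helps both solvers converge
    ("clf", LogisticRegression(max_iter=1000)),   # generous max_iter avoids warnings
])

param_grid = {
    "clf__penalty": ["l2"],
    "clf__C": [0.01, 0.1, 1, 10, 100],
    "clf__solver": ["lbfgs", "liblinear"],
}

# 5-fold cross-validation over all 10 combinations; refits the best one.
search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)
print(search.best_params_)
print("held-out accuracy:", search.score(X_test, y_test))
```

`best_params_` reports the winning combination, and because GridSearchCV refits on the full training set, `search` can be used directly as the tuned classifier.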