What is Logistic Regression?
Logistic Regression is a supervised machine learning algorithm used for classification.
It predicts probabilities, and based on those probabilities, it classifies data into categories like:
Hired (1) or Not Hired (0)
Spam or Not Spam
Fraud or Not Fraud
Customer will churn or not
Even though the name contains “regression,” it is actually used for classification, not predicting continuous numbers.
Why Do We Need Logistic Regression?
Because many real-world problems require a YES/NO decision, not a number.
Example: If a candidate has certain skills & experience → Will they be hired?
Logistic Regression helps convert input features into a probability score between 0 and 1.
If probability ≥ 0.5 → predict 1
If probability < 0.5 → predict 0
How Logistic Regression Works (Simple Explanation)
Step 1: It first creates a linear equation like Linear Regression:
[ z = w_1x_1 + w_2x_2 + \dots + w_nx_n + b ]
Example:
[ z = 0.8(\text{Experience}) + 1.5(\text{SkillsScore}) - 5 ]
Step 2: That value z can be any number (positive or negative).
But we want a probability between 0 and 1.
So logistic regression uses the Sigmoid Function:
[ \sigma(z) = \frac{1}{1+e^{-z}} ]
This converts z → probability.
Example:
If z = 3 → sigmoid ≈ 0.95
If z = –2 → sigmoid ≈ 0.12
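To make Steps 1 and 2 concrete, here is a minimal Python sketch of the sigmoid function applied to the example equation above; the candidate's experience and skills values are made up purely for illustration:

```python
import math

def sigmoid(z):
    """Map any real number z to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))

# Verify the examples above
print(round(sigmoid(3), 2))   # 0.95
print(round(sigmoid(-2), 2))  # 0.12

# Hypothetical candidate: 4 years of experience, skills score of 2
z = 0.8 * 4 + 1.5 * 2 - 5     # z = 1.2
print(round(sigmoid(z), 2))   # ~0.77 -> likely hired
```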
Step 3: Convert probability into a final prediction
| Probability | Prediction |
|---|---|
| ≥ 0.5 | 1 (Yes/Hired) |
| < 0.5 | 0 (No/Not hired) |
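Putting the three steps together, here is a minimal sketch using scikit-learn's LogisticRegression (assuming scikit-learn is installed); the hiring data below is invented purely for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy hiring data (invented for illustration):
# columns = [years_experience, skills_score]; label 1 = hired
X = np.array([[1, 3], [2, 3], [3, 4], [5, 8], [6, 9], [7, 7]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)

candidate = np.array([[4, 6]])
proba = model.predict_proba(candidate)[0, 1]  # P(class 1) = P(hired)
print(f"P(hired) = {proba:.2f}")
print("Prediction:", model.predict(candidate)[0])  # applies the 0.5 cut-off
```

`predict_proba` returns the probability of each class, and `predict` applies the threshold for you.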
Why Sigmoid Function?
Because it:
✔ Squashes numbers between 0 and 1
✔ Acts like an S-shaped curve
✔ Helps interpret results easily as probabilities
Logistic Regression Example (Simple)
If the model outputs a probability of 0.82, this means:
“The candidate has an 82% chance of being hired.”
So final prediction = 1 (Hired).
When Do We Use Logistic Regression?
Binary Classification
(only two classes: 0 or 1)
Examples:
Will customer buy? (yes/no)
Will patient recover? (yes/no)
Will loan get approved? (yes/no)
Multi-class classification (Softmax logistic regression; see the code sketch after this list)
Classifying fruits (Apple, Mango, Banana)
Predicting education level (low/medium/high)
Probability prediction
Logistic regression gives interpretable probabilities, which many ML algorithms do not.
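As a sketch of the multi-class case, scikit-learn's LogisticRegression handles more than two classes via the softmax (multinomial) formulation; here it is on the classic iris dataset (assuming scikit-learn is installed):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# With more than two classes, recent scikit-learn versions fit the
# multinomial (softmax) formulation by default
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)

# One probability per class; the three values sum to 1
print(clf.predict_proba(X[:1]).round(3))
print(clf.predict(X[:1]))  # the most probable class
```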
Advantages of Logistic Regression
✔ Simple and easy to interpret
✔ Very fast to train
✔ Works well with small datasets
✔ Outputs probability
✔ Good baseline model before trying advanced ML models
Disadvantages
Assumes a linear relationship between the features and the log-odds of the output
Not ideal for complex data
Can struggle with non-linear decision boundaries
Sensitive to outliers and highly correlated features
Visual Intuition
Think of logistic regression as:
“Drawing a boundary line that separates Class 0 and Class 1.”
Since the sigmoid function is at the heart of logistic regression, let's look at it in more detail.
What is the Sigmoid Function?
The sigmoid function is a mathematical function that converts any real number (−∞ to +∞) into a value between 0 and 1.
Because of this, it is widely used in:
✔ Logistic Regression ✔ Neural Networks ✔ Probability prediction
Sigmoid Function Formula
[ \sigma(z) = \frac{1}{1 + e^{-z}} ]
Where:
z = any number (output of linear equation)
e ≈ 2.718 (Euler’s number)
Why Do We Use Sigmoid?
Because logistic regression needs to convert raw values into probabilities.
Example:
If sigmoid = 0.9 → 90% probability → class 1
If sigmoid = 0.2 → 20% probability → class 0
The sigmoid curve is smooth, continuous, and always outputs between 0 and 1.
Intuition Behind Sigmoid
Let’s think of z as evidence for a class:
If evidence is strongly positive → z is large and positive → sigmoid is close to 1
If evidence is strongly negative → z is large and negative → sigmoid is close to 0
If evidence is unclear → z ≈ 0 → sigmoid ≈ 0.5
This is why sigmoid works beautifully for binary classification.
Example Calculations
🔹 Case 1: Large positive value
[ \sigma(5) = \frac{1}{1 + e^{-5}} \approx 0.993 ]
Meaning = 99.3% chance of class 1
🔹 Case 2: Large negative value
[ \sigma(-5) \approx 0.0067 ]
Meaning = 0.67% chance of class 1 → almost class 0
🔹 Case 3: Zero
[ \sigma(0) = 0.5 ]
Meaning = 50% probability → unclear → boundary
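These values are easy to verify in code; for instance, SciPy ships a numerically stable sigmoid called expit (assuming SciPy is available):

```python
from scipy.special import expit  # numerically stable sigmoid

for z in (5, -5, 0):
    print(f"sigmoid({z}) = {expit(z):.4f}")
# sigmoid(5)  = 0.9933
# sigmoid(-5) = 0.0067
# sigmoid(0)  = 0.5000
```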
Sigmoid Curve (Shape)
The graph is S-shaped:
Starts near 0
Rises smoothly
Ends near 1
This is why sigmoid is also called a:
S-curve
Logistic function
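A short matplotlib snippet (assuming matplotlib is installed) reproduces the S-curve:

```python
import numpy as np
import matplotlib.pyplot as plt

z = np.linspace(-10, 10, 200)
sigma = 1 / (1 + np.exp(-z))

plt.plot(z, sigma)
plt.axhline(0.5, linestyle="--", color="gray")  # the 0.5 decision threshold
plt.axvline(0.0, linestyle="--", color="gray")  # sigma(0) = 0.5
plt.xlabel("z")
plt.ylabel("sigmoid(z)")
plt.title("Sigmoid: the S-shaped logistic curve")
plt.show()
```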
Important Properties
1️⃣ Output always between 0 and 1
Perfect for probability.
2️⃣ Smooth and differentiable
Helps optimization algorithms (like Gradient Descent).
3️⃣ The curve is steep in the middle
Small changes in z near 0 → big change in probability. This is useful for decision boundaries (see the numeric check after this list).
4️⃣ Symmetric around 0
[ \sigma(-z) = 1 - \sigma(z), \qquad \sigma(0) = 0.5 ]
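Here is a small NumPy check of properties 3 and 4; the sample points are arbitrary:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Property 3: steep in the middle, flat at the ends
print(sigmoid(1) - sigmoid(0))  # ~0.231: a step of 1 near z = 0 moves p a lot
print(sigmoid(5) - sigmoid(4))  # ~0.011: the same step far from 0 barely moves p

# Property 4: symmetry, sigmoid(-z) = 1 - sigmoid(z)
print(np.isclose(sigmoid(-2), 1 - sigmoid(2)))  # True
```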
Derivative of Sigmoid (For Training Models)
[ \sigma'(z) = \sigma(z) (1 - \sigma(z)) ]
This is super simple — and the reason why logistic regression is mathematically easy to optimize.
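You can sanity-check the derivative formula against a finite-difference approximation; the test point z = 1.3 below is arbitrary:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

z = 1.3  # arbitrary test point
analytic = sigmoid(z) * (1 - sigmoid(z))

# Independent check with a central finite difference
h = 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

print(analytic, numeric)  # the two values agree to many decimal places
```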
Why Logistic Regression Uses Sigmoid
Because logistic regression wants to answer:
“What is the probability that this sample belongs to class 1?”
Sigmoid converts the linear equation into a probability. Then we classify:
≥ 0.5 → Class 1
< 0.5 → Class 0
Real-World Interpretation Example
Suppose logistic regression outputs z = 2.8. Applying the sigmoid:
[ \sigma(2.8) \approx 0.94 ]
Meaning:
“There is a 94% chance that the candidate will be hired.”
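A quick check in plain Python confirms the number:

```python
import math

p = 1 / (1 + math.exp(-2.8))
print(round(p, 2))  # 0.94 -> a 94% chance of being hired
```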