
What is Logistic Regression?

Logistic Regression is a supervised machine learning algorithm used for classification.

It predicts probabilities, and based on those probabilities, it classifies data into categories like:

  • Hired (1) or Not Hired (0)

  • Spam or Not Spam

  • Fraud or Not Fraud

  • Customer will churn or not

Even though the name contains “regression,” it is actually used for classification, not predicting continuous numbers.


Why Do We Need Logistic Regression?

Because many real-world problems require a YES/NO decision, not a number.

Example: If a candidate has certain skills & experience → Will they be hired?

Logistic Regression converts input features into a probability score between 0 and 1:

  • If probability ≥ 0.5 → predict 1

  • If probability < 0.5 → predict 0


How Logistic Regression Works (Simple Explanation)

Step 1: It first creates a linear equation like Linear Regression:

$$ z = w_1x_1 + w_2x_2 + \dots + b $$

Example:

$$ z = 0.8(\text{Experience}) + 1.5(\text{SkillsScore}) - 5 $$
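
As a quick sketch, here is this linear step in Python. The weights (0.8, 1.5) and bias (−5) are the illustrative values from the example above, not coefficients learned from data:

```python
# Step 1 as code: combine the inputs into a single score z.
# Weights and bias are the made-up example values above,
# not parameters fitted to real data.
def linear_score(experience, skills_score):
    return 0.8 * experience + 1.5 * skills_score - 5

print(linear_score(4, 3))  # 0.8*4 + 1.5*3 - 5 = 2.7
```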


Step 2: That value z can be any number (positive or negative).

But we want a probability between 0 and 1.

So logistic regression uses the Sigmoid Function:

$$ \sigma(z) = \frac{1}{1+e^{-z}} $$

This converts z → probability.

Example:

  • If z = 3 → sigmoid ≈ 0.95

  • If z = –2 → sigmoid ≈ 0.12
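
A tiny plain-Python sketch of the sigmoid, verifying the two values above:

```python
import math

def sigmoid(z):
    """Map any real number z to a value strictly between 0 and 1."""
    return 1 / (1 + math.exp(-z))

print(round(sigmoid(3), 2))   # 0.95
print(round(sigmoid(-2), 2))  # 0.12
```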


Step 3: Convert probability into a final prediction

| Probability | Prediction |
|---|---|
| ≥ 0.5 | 1 (Yes/Hired) |
| < 0.5 | 0 (No/Not hired) |
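
Putting the three steps together, here is a minimal end-to-end sketch. The weights are the same illustrative values as above, and 0.5 is the usual default threshold:

```python
import math

def predict(experience, skills_score, threshold=0.5):
    """Steps 1-3: linear score -> sigmoid probability -> class label."""
    z = 0.8 * experience + 1.5 * skills_score - 5   # Step 1: linear equation (example weights)
    p = 1 / (1 + math.exp(-z))                      # Step 2: sigmoid turns z into a probability
    return 1 if p >= threshold else 0               # Step 3: threshold decision

print(predict(4, 3))  # z = 2.7,  p ≈ 0.94 -> 1 (Hired)
print(predict(1, 1))  # z = -2.7, p ≈ 0.06 -> 0 (Not hired)
```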

Why Sigmoid Function?

Because it:

✔ Squashes any real number into the range 0 to 1

✔ Produces an S-shaped curve

✔ Makes results easy to interpret as probabilities


Logistic Regression Example (Simple)

If the model outputs:

Probability(Hired) = 0.82

This means:

“The candidate has an 82% chance of being hired.”

So final prediction = 1 (Hired).
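
For illustration, here is how this could look with scikit-learn. The hiring dataset below (years of experience, skills score out of 10, and the labels) is invented for demonstration:

```python
# A minimal scikit-learn sketch on made-up hiring data.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[1, 3], [2, 4], [3, 5], [4, 6], [5, 8], [6, 9]])  # toy features
y = np.array([0, 0, 0, 1, 1, 1])                                # 0 = not hired, 1 = hired

model = LogisticRegression()
model.fit(X, y)

candidate = np.array([[5, 7]])
prob_hired = model.predict_proba(candidate)[0, 1]  # P(class 1)
print(f"Probability(Hired) = {prob_hired:.2f}")
print("Prediction:", model.predict(candidate)[0])
```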


When Do We Use Logistic Regression?

Binary Classification

(only two classes: 0 or 1)

Examples:

  • Will customer buy? (yes/no)

  • Will patient recover? (yes/no)

  • Will loan get approved? (yes/no)

Multi-class classification (Softmax logistic regression; see the sketch below)

  • Classifying fruits (Apple, Mango, Banana)

  • Predicting education level (low/medium/high)
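
For multi-class problems, scikit-learn's LogisticRegression extends beyond two classes automatically (softmax/multinomial under the default solver). The sketch below uses the built-in iris dataset as a stand-in for the fruit example:

```python
# Multi-class sketch: three flower classes instead of two.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)   # 3 classes: 0, 1, 2
clf = LogisticRegression(max_iter=1000).fit(X, y)

print(clf.predict(X[:3]))        # predicted class labels
print(clf.predict_proba(X[:1]))  # one probability per class, summing to 1
```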

Probability prediction

Logistic regression gives interpretable probabilities, which many other ML algorithms do not.


Advantages of Logistic Regression

✔ Simple and easy to interpret

✔ Very fast to train

✔ Works well with small datasets

✔ Outputs probabilities

✔ Good baseline model before trying advanced ML models


Disadvantages

  • Assumes a linear relationship between the features and the log-odds of the output

  • Not well suited to complex data

  • Can struggle with non-linear decision boundaries

  • Sensitive to outliers and strongly correlated features


Visual Intuition

Think of logistic regression as:

"Drawing a boundary line that separates Class 0 and Class 1.”

The sections below take a closer look at the sigmoid function, the key ingredient that makes logistic regression work.


What is the Sigmoid Function?

The sigmoid function is a mathematical function that converts any real number (−∞ to +∞) into a value between 0 and 1.

Because of this, it is widely used in:

✔ Logistic Regression

✔ Neural Networks

✔ Probability prediction


Sigmoid Function Formula

$$ \sigma(z) = \frac{1}{1 + e^{-z}} $$

Where:

  • z = any number (output of linear equation)

  • e ≈ 2.718 (Euler’s number, the base of the natural logarithm)


Why Do We Use Sigmoid?

Because logistic regression needs to convert raw values into probabilities.

Example:

  • If sigmoid = 0.9 → 90% probability → class 1

  • If sigmoid = 0.2 → 20% probability → class 0

The sigmoid curve is smooth, continuous, and always outputs between 0 and 1.


Intuition Behind Sigmoid

Let’s think of z as evidence for a class:

  • If evidence is strong positive → z is large → sigmoid → close to 1

  • If evidence is negative → z is small → sigmoid → close to 0

  • If evidence is unclear → z = 0 → sigmoid = 0.5

This is why sigmoid works beautifully for binary classification.


Example Calculations

🔹 Case 1: Large positive value

$$ \sigma(5) = \frac{1}{1 + e^{-5}} \approx 0.993 $$

Meaning = 99.3% chance of class 1


🔹 Case 2: Large negative value

$$ \sigma(-5) \approx 0.0067 $$

Meaning = 0.67% chance of class 1 → almost class 0


🔹 Case 3: Zero

$$ \sigma(0) = 0.5 $$

Meaning = 50% probability → the model is uncertain → this is the decision boundary
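
These three cases are easy to verify in Python:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Reproduce the three cases above.
for z in (5, -5, 0):
    print(f"sigmoid({z:>2}) = {sigmoid(z):.4f}")
# sigmoid( 5) = 0.9933
# sigmoid(-5) = 0.0067
# sigmoid( 0) = 0.5000
```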


Sigmoid Curve (Shape)

The graph is S-shaped:

  • Starts near 0

  • Rises smoothly

  • Ends near 1

This is why sigmoid is also called a:

  • S-curve

  • Logistic function
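
A short matplotlib sketch reproduces the S-curve; the dashed lines mark the point z = 0, p = 0.5 where the decision boundary sits:

```python
# Plot the S-shaped sigmoid curve.
import numpy as np
import matplotlib.pyplot as plt

z = np.linspace(-10, 10, 200)
sigma = 1 / (1 + np.exp(-z))

plt.plot(z, sigma)
plt.axhline(0.5, linestyle="--", color="gray")  # decision threshold p = 0.5
plt.axvline(0, linestyle="--", color="gray")    # z = 0 maps to p = 0.5
plt.xlabel("z")
plt.ylabel("sigmoid(z)")
plt.title("Sigmoid (logistic) function")
plt.show()
```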


Important Properties

1️⃣ Output always between 0 and 1

Perfect for probability.

2️⃣ Smooth and differentiable

Helps optimization algorithms (like Gradient Descent).

3️⃣ The curve is steep in the middle

Small changes in z near the middle produce big changes in probability, which is useful for decision boundaries.

4️⃣ Symmetric around 0

$$ \sigma(-z) = 1 - \sigma(z), \quad \text{so } \sigma(0) = 0.5 $$


Derivative of Sigmoid (For Training Models)

$$ \sigma'(z) = \sigma(z)\,(1 - \sigma(z)) $$

This derivative is remarkably simple, and it is one reason logistic regression is easy to optimize with gradient-based methods.
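
A quick numerical check of this identity, comparing the analytic derivative with a finite-difference approximation (the point z = 1.3 and the step size h are arbitrary choices):

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

z, h = 1.3, 1e-6
analytic = sigmoid(z) * (1 - sigmoid(z))                 # sigma(z) * (1 - sigma(z))
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)    # central difference
print(analytic, numeric)  # the two values agree to many decimal places
```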


Why Logistic Regression Uses Sigmoid

Because logistic regression wants to answer:

“What is the probability that this sample belongs to class 1?”

Sigmoid converts the linear equation into a probability. Then we classify:

  • probability ≥ 0.5 → Class 1

  • probability < 0.5 → Class 0


Real-World Interpretation Example

Suppose logistic regression outputs:

z = 2.8

Sigmoid:

$$ \sigma(2.8) \approx 0.94 $$

Meaning:

“There is a 94% chance that the candidate will be hired.”
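
Verifying this in Python:

```python
import math

z = 2.8
p = 1 / (1 + math.exp(-z))
print(f"{p:.2f}")  # 0.94 -> predict class 1 (Hired)
```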