GitHub Repository: suyashi29/python-su
Path: blob/master/Machine Learning Ensemble Methods/4 AdaBoost.ipynb
Kernel: Python 3 (ipykernel)

Worked Example (Age Classification)

Step 1: Initialize weights. Total samples = 500, so every sample starts with weight w_i = 1/500.

Step 2: Train the first weak learner, a decision stump on Age: if age < 20 → class "Young", if age >= 20 → class "Adult". Some samples break this rule and get misclassified, e.g. a 17-year-old whose true label is Adult, or a 23-year-old whose true label is Young.

Step 3: Compute the error: the weighted sum of the misclassified points, ε = Σ w_i over the mistakes.

Step 4: Compute model importance. Each weak learner gets a vote weight α = ½ ln((1 − ε)/ε), so a lower error gives a larger α.

Step 5: Update the sample weights: correctly classified points get w_i ← w_i · e^(−α), misclassified points get w_i ← w_i · e^(+α), then normalize so the weights sum to 1.

Step 6: Train the next weak learner on the reweighted data, and repeat for T iterations.
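One round of these updates can be sketched numerically. The numbers below are illustrative assumptions (a toy case where the first stump misclassifies 150 of the 500 samples, i.e. ε = 0.3), not values from the example above:

```python
import numpy as np

n = 500
w = np.full(n, 1 / n)  # Step 1: uniform weights, 1/500 each

# Assume the first stump misclassifies the first 150 samples (epsilon = 0.3)
misclassified = np.zeros(n, dtype=bool)
misclassified[:150] = True

eps = w[misclassified].sum()            # Step 3: weighted error
alpha = 0.5 * np.log((1 - eps) / eps)   # Step 4: model importance

# Step 5: up-weight mistakes, down-weight correct samples, then renormalize
w = w * np.exp(np.where(misclassified, alpha, -alpha))
w /= w.sum()

print("epsilon:", round(eps, 3))   # 0.3
print("alpha:", round(alpha, 3))   # ~0.424
```

After renormalization, each misclassified sample carries more weight than each correct one, so the next stump is pulled toward the mistakes.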

AdaBoost stands for Adaptive Boosting.

It's a boosting ensemble method that:

  • Combines many weak learners (usually shallow decision trees)

  • Builds them one after another

  • Each new model focuses more on the mistakes made by earlier models

  • Produces a strong final classifier through weighted voting

Key Concept (In Very Simple Terms)

  • Start with equal weights on all training samples.

  • Train a weak learner (e.g., a decision stump = depth‑1 tree).

  • Increase the weights of misclassified samples.

  • Train the next learner with these updated weights.

  • Combine all learners using a weighted vote (stronger models get higher weights).

Result: A powerful model that focuses on hard‑to-classify points.
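The loop described above can be written from scratch in a few lines. This is a minimal SAMME-style sketch with labels mapped to {−1, +1}, meant only to make the reweighting concrete; the dataset and hyperparameters here are illustrative, and sklearn's AdaBoostClassifier (used later) is the production choice:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification

X, y01 = make_classification(n_samples=300, n_features=2, n_redundant=0,
                             n_informative=2, random_state=0)
y = np.where(y01 == 1, 1, -1)  # AdaBoost math is cleanest with labels in {-1, +1}

n = len(y)
w = np.full(n, 1 / n)          # start with equal weights
stumps, alphas = [], []

for t in range(20):
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X, y, sample_weight=w)   # weak learner sees current weights
    pred = stump.predict(X)
    eps = w[pred != y].sum()           # weighted error
    if eps >= 0.5:                     # no better than chance: stop
        break
    alpha = 0.5 * np.log((1 - eps) / (eps + 1e-12))
    w *= np.exp(-alpha * y * pred)     # mistakes (y*pred = -1) get up-weighted
    w /= w.sum()
    stumps.append(stump)
    alphas.append(alpha)

# Final strong classifier: weighted vote of all stumps
F = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
print("train accuracy:", (np.sign(F) == y).mean())
```

Note how a single stump is weak on this data, but the weighted vote of many reweighted stumps classifies most points correctly.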

Visual Analogy

Imagine a teacher testing students:

  • First test → some students struggle.

  • Teacher focuses more on those weak areas → next test.

  • Again focuses on remaining weak topics → next test.

  • After many small tests, the teacher combines all scores → final understanding.

This is exactly AdaBoost’s strategy!

Mathematical Formulation

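The standard AdaBoost update rules (which the image here presumably showed) are, for weak learner h_t at round t:

```latex
\varepsilon_t = \sum_{i:\, h_t(x_i) \neq y_i} w_i^{(t)}, \qquad
\alpha_t = \frac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t}

w_i^{(t+1)} = \frac{w_i^{(t)}\, e^{-\alpha_t\, y_i\, h_t(x_i)}}{Z_t}, \qquad
H(x) = \operatorname{sign}\!\left(\sum_{t=1}^{T} \alpha_t\, h_t(x)\right)
```

Here Z_t is the normalizer that keeps the weights summing to 1, and with labels y_i ∈ {−1, +1} the factor y_i h_t(x_i) is −1 exactly on mistakes, which reproduces the e^(+α)/e^(−α) update from Step 5.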

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
import numpy as np

# Create simple synthetic dataset
X, y = make_classification(
    n_samples=500, n_features=2, n_classes=2,
    n_redundant=0, n_informative=2, random_state=42
)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# AdaBoost with decision stumps (depth=1)
ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=50,
    learning_rate=1.0,
    random_state=42
)

# Train
ada.fit(X_train, y_train)

# Predict
y_pred = ada.predict(X_test)

# Accuracy
print("Accuracy:", accuracy_score(y_test, y_pred))
Accuracy: 0.88