GitHub Repository: suyashi29/python-su
Path: blob/master/Machine Learning Ensemble Methods/4 AdaBoost.ipynb
Kernel: Python 3 (ipykernel)

Worked Example (Age Classification)

Step 1: Initialize weights. Total samples = 500, so every sample starts with weight w_i = 1/500.

Step 2: Train the first weak learner, a decision stump on Age: if age < 20 → class "Young", if age >= 20 → class "Adult". Some samples break this rule and get misclassified, e.g. a 17-year-old whose true label is Adult, or a 23-year-old whose true label is Young.

Step 3: Compute the error: the weighted sum of the misclassified points, ε = Σ w_i over the mistakes.

Step 4: Compute model importance. Each weak learner gets a vote weight α = ½ ln((1 − ε)/ε), so a lower error gives a larger α.

Step 5: Update the sample weights: correctly classified points get w_i ← w_i · e^(−α), misclassified points get w_i ← w_i · e^(+α), then normalize so the weights sum to 1.

Step 6: Train the next weak learner on the reweighted data, and repeat for T iterations.
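One round of these updates can be sketched numerically. The numbers below are illustrative assumptions (a toy case where the first stump misclassifies 150 of the 500 samples, i.e. ε = 0.3), not values from the example above:

```python
import numpy as np

n = 500
w = np.full(n, 1 / n)  # Step 1: uniform weights, 1/500 each

# Assume the first stump misclassifies the first 150 samples (epsilon = 0.3)
misclassified = np.zeros(n, dtype=bool)
misclassified[:150] = True

eps = w[misclassified].sum()            # Step 3: weighted error
alpha = 0.5 * np.log((1 - eps) / eps)   # Step 4: model importance

# Step 5: up-weight mistakes, down-weight correct samples, then renormalize
w = w * np.exp(np.where(misclassified, alpha, -alpha))
w /= w.sum()

print("epsilon:", round(eps, 3))   # 0.3
print("alpha:", round(alpha, 3))   # ~0.424
```

After renormalization, each misclassified sample carries more weight than each correct one, so the next stump is pulled toward the mistakes.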

AdaBoost stands for Adaptive Boosting.

It's a boosting ensemble method that:

  • Combines many weak learners (usually shallow decision trees)

  • Builds them one after another

  • Each new model focuses more on the mistakes made by earlier models

  • Produces a strong final classifier through weighted voting

Key Concept (In Very Simple Terms)

  • Start with equal weights on all training samples.

  • Train a weak learner (e.g., a decision stump = depth‑1 tree).

  • Increase the weights of misclassified samples.

  • Train the next learner with these updated weights.

  • Combine all learners using a weighted vote (stronger models get higher weights).

Result: A powerful model that focuses on hard‑to-classify points.
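The loop described above can be written from scratch in a few lines. This is a minimal SAMME-style sketch with labels mapped to {−1, +1}, meant only to make the reweighting concrete; the dataset and hyperparameters here are illustrative, and sklearn's AdaBoostClassifier (used later) is the production choice:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification

X, y01 = make_classification(n_samples=300, n_features=2, n_redundant=0,
                             n_informative=2, random_state=0)
y = np.where(y01 == 1, 1, -1)  # AdaBoost math is cleanest with labels in {-1, +1}

n = len(y)
w = np.full(n, 1 / n)          # start with equal weights
stumps, alphas = [], []

for t in range(20):
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X, y, sample_weight=w)   # weak learner sees current weights
    pred = stump.predict(X)
    eps = w[pred != y].sum()           # weighted error
    if eps >= 0.5:                     # no better than chance: stop
        break
    alpha = 0.5 * np.log((1 - eps) / (eps + 1e-12))
    w *= np.exp(-alpha * y * pred)     # mistakes (y*pred = -1) get up-weighted
    w /= w.sum()
    stumps.append(stump)
    alphas.append(alpha)

# Final strong classifier: weighted vote of all stumps
F = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
print("train accuracy:", (np.sign(F) == y).mean())
```

Note how a single stump is weak on this data, but the weighted vote of many reweighted stumps classifies most points correctly.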

Visual Analogy

Imagine a teacher testing students:

  • First test → some students struggle.

  • Teacher focuses more on those weak areas → next test.

  • Again focuses on remaining weak topics → next test.

  • After many small tests, the teacher combines all scores → final understanding.

This is exactly AdaBoost’s strategy!

Mathematical Formulation

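The standard AdaBoost update rules (which the image here presumably showed) are, for weak learner h_t at round t:

```latex
\varepsilon_t = \sum_{i:\, h_t(x_i) \neq y_i} w_i^{(t)}, \qquad
\alpha_t = \frac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t}

w_i^{(t+1)} = \frac{w_i^{(t)}\, e^{-\alpha_t\, y_i\, h_t(x_i)}}{Z_t}, \qquad
H(x) = \operatorname{sign}\!\left(\sum_{t=1}^{T} \alpha_t\, h_t(x)\right)
```

Here Z_t is the normalizer that keeps the weights summing to 1, and with labels y_i ∈ {−1, +1} the factor y_i h_t(x_i) is −1 exactly on mistakes, which reproduces the e^(+α)/e^(−α) update from Step 5.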

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
import numpy as np

# Create simple synthetic dataset
X, y = make_classification(
    n_samples=500, n_features=2, n_classes=2,
    n_redundant=0, n_informative=2, random_state=42
)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# AdaBoost with decision stumps (depth=1)
ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=50,
    learning_rate=1.0,
    random_state=42
)

# Train
ada.fit(X_train, y_train)

# Predict
y_pred = ada.predict(X_test)

# Accuracy
print("Accuracy:", accuracy_score(y_test, y_pred))
Accuracy: 0.88