Kernel: Python 3 (ipykernel)
Machine Learning with PyTorch and Scikit-Learn
-- Code Examples
Package version checks
Add the folder to the path in order to load the check_packages.py script:
In [1]:
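A minimal sketch of the path setup, assuming the check_packages.py script sits one directory above this notebook:

import sys

# Make the parent folder importable so the package-checking script can be found
sys.path.insert(0, '..')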
Check recommended package versions:
In [2]:
Out[2]:
[OK] Your Python version is 3.10.14 (main, May 6 2024, 14:42:37) [Clang 14.0.6 ]
[OK] numpy 1.26.4
[OK] scipy 1.12.0
[OK] matplotlib 3.9.2
[OK] sklearn 1.6.1
[OK] pandas 2.2.2
[OK] xgboost 2.1.3
Chapter 7 - Combining Different Models for Ensemble Learning
Overview
In [3]:
Learning with ensembles
In [4]:
Out[4]:
In [5]:
Out[5]:
In [6]:
In [7]:
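A minimal sketch of the ensemble-error computation; the function name ensemble_error and the call with n_classifier=11 and error=0.25 are assumptions, but they reproduce the value shown in Out[7] below:

import math
from scipy.special import comb


def ensemble_error(n_classifier, error):
    # Probability that a majority (more than half) of n independent base
    # classifiers, each with the given error rate, errs at the same time
    k_start = int(math.ceil(n_classifier / 2.0))
    probs = [comb(n_classifier, k) *
             error**k *
             (1 - error)**(n_classifier - k)
             for k in range(k_start, n_classifier + 1)]
    return sum(probs)


print(ensemble_error(n_classifier=11, error=0.25))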
Out[7]:
0.03432750701904297
In [8]:
In [9]:
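Building on the ensemble_error sketch above, the error of an 11-member ensemble can be compared against the base error over the full range of base error rates (a sketch; the plot styling is an assumption):

import numpy as np
import matplotlib.pyplot as plt

error_range = np.arange(0.0, 1.01, 0.01)
ens_errors = [ensemble_error(n_classifier=11, error=err) for err in error_range]

plt.plot(error_range, ens_errors, label='Ensemble error', linewidth=2)
plt.plot(error_range, error_range, linestyle='--', label='Base error', linewidth=2)
plt.xlabel('Base error')
plt.ylabel('Base/Ensemble error')
plt.legend(loc='upper left')
plt.grid(alpha=0.5)
plt.show()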
Out[9]:
Combining classifiers via majority vote
Implementing a simple majority vote classifier
In [10]:
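A minimal sketch of a weighted class-label vote; the votes [0, 0, 1] and the weights [0.2, 0.2, 0.6] are assumptions consistent with the result shown in Out[10]:

import numpy as np

# Weighted plurality vote over predicted class labels:
# class 1 wins because its single vote carries weight 0.6 > 0.2 + 0.2
np.argmax(np.bincount([0, 0, 1], weights=[0.2, 0.2, 0.6]))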
Out[10]:
1
In [11]:
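Likewise, a sketch of a weighted average over predicted class probabilities; the probability matrix is an assumption consistent with Out[11] and Out[12]:

import numpy as np

# Class-membership probabilities returned by three classifiers
ex = np.array([[0.9, 0.1],
               [0.8, 0.2],
               [0.4, 0.6]])

# Weighted average per class; evaluates to array([0.58, 0.42])
p = np.average(ex, axis=0, weights=[0.2, 0.2, 0.6])

# The final prediction is the class with the largest averaged probability (0 here)
np.argmax(p)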
Out[11]:
array([0.58, 0.42])
In [12]:
Out[12]:
0
In [13]:
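Below is a minimal sketch of such a majority-vote classifier. It follows scikit-learn's fit/predict/predict_proba conventions and exposes nested parameter names such as 'pipeline-1__clf__C' (cf. the Out[20] listing further down) via scikit-learn's _name_estimators helper; input validation and docstrings are omitted, so treat it as an outline rather than the full implementation:

import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.pipeline import _name_estimators
from sklearn.preprocessing import LabelEncoder


class MajorityVoteClassifier(BaseEstimator, ClassifierMixin):
    """A majority-vote ensemble over an arbitrary list of classifiers."""

    def __init__(self, classifiers, vote='classlabel', weights=None):
        self.classifiers = classifiers
        self.vote = vote        # 'classlabel' or 'probability'
        self.weights = weights  # optional importance weight per classifier

    def fit(self, X, y):
        # Encode labels as 0..n_classes-1 so bincount/argmax work in predict
        self.lablenc_ = LabelEncoder()
        self.lablenc_.fit(y)
        self.classes_ = self.lablenc_.classes_
        self.classifiers_ = [clone(clf).fit(X, self.lablenc_.transform(y))
                             for clf in self.classifiers]
        return self

    def predict(self, X):
        if self.vote == 'probability':
            maj_vote = np.argmax(self.predict_proba(X), axis=1)
        else:  # 'classlabel': weighted vote over the predicted labels
            predictions = np.asarray([clf.predict(X)
                                      for clf in self.classifiers_]).T
            maj_vote = np.apply_along_axis(
                lambda x: np.argmax(np.bincount(x, weights=self.weights)),
                axis=1, arr=predictions)
        return self.lablenc_.inverse_transform(maj_vote)

    def predict_proba(self, X):
        # Weighted average of the individual class-membership probabilities
        probas = np.asarray([clf.predict_proba(X)
                             for clf in self.classifiers_])
        return np.average(probas, axis=0, weights=self.weights)

    def get_params(self, deep=True):
        # Expose nested names such as 'pipeline-1__clf__C' so the ensemble
        # can be tuned with GridSearchCV
        if not deep:
            return super().get_params(deep=False)
        out = dict(_name_estimators(self.classifiers))
        for name, step in _name_estimators(self.classifiers):
            for key, value in step.get_params(deep=True).items():
                out[f'{name}__{key}'] = value
        return out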
Using the majority voting principle to make predictions
In [14]:
In [15]:
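A minimal sketch of the 10-fold comparison, assuming X_train and y_train come from the preceding data-preparation cell; the hyperparameters match those visible in the Out[20] parameter listing further down:

from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

clf1 = LogisticRegression(penalty='l2', C=0.001, solver='lbfgs', random_state=1)
clf2 = DecisionTreeClassifier(max_depth=1, criterion='entropy', random_state=0)
clf3 = KNeighborsClassifier(n_neighbors=1, p=2, metric='minkowski')

# Scale the inputs for the models that are sensitive to feature scales
pipe1 = Pipeline([('sc', StandardScaler()), ('clf', clf1)])
pipe3 = Pipeline([('sc', StandardScaler()), ('clf', clf3)])

clf_labels = ['Logistic regression', 'Decision tree', 'KNN']
print('10-fold cross validation:\n')
for clf, label in zip([pipe1, clf2, pipe3], clf_labels):
    scores = cross_val_score(estimator=clf, X=X_train, y=y_train,
                             cv=10, scoring='roc_auc')
    print(f'ROC AUC: {scores.mean():.2f} (+/- {scores.std():.2f}) [{label}]')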
Out[15]:
10-fold cross validation:
ROC AUC: 0.92 (+/- 0.15) [Logistic regression]
ROC AUC: 0.87 (+/- 0.18) [Decision tree]
ROC AUC: 0.85 (+/- 0.13) [KNN]
In [16]:
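A sketch of extending the same comparison with the majority-vote ensemble defined above, assuming pipe1, clf2, pipe3, clf_labels, and the training split from the previous sketch:

# Combine the three base models and repeat the 10-fold evaluation
mv_clf = MajorityVoteClassifier(classifiers=[pipe1, clf2, pipe3])

clf_labels += ['Majority voting']
all_clf = [pipe1, clf2, pipe3, mv_clf]

for clf, label in zip(all_clf, clf_labels):
    scores = cross_val_score(estimator=clf, X=X_train, y=y_train,
                             cv=10, scoring='roc_auc')
    print(f'ROC AUC: {scores.mean():.2f} (+/- {scores.std():.2f}) [{label}]')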
Out[16]:
ROC AUC: 0.92 (+/- 0.15) [Logistic regression]
ROC AUC: 0.87 (+/- 0.18) [Decision tree]
ROC AUC: 0.85 (+/- 0.13) [KNN]
ROC AUC: 0.98 (+/- 0.05) [Majority voting]
Evaluating and tuning the ensemble classifier
In [17]:
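A minimal sketch of the ROC comparison on the held-out split, assuming all_clf, clf_labels, and the train/test arrays from the sketches above, and that the positive class is labeled 1:

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

colors = ['black', 'orange', 'blue', 'green']
linestyles = [':', '--', '-.', '-']
for clf, label, clr, ls in zip(all_clf, clf_labels, colors, linestyles):
    # Use the predicted probability of the positive class as the score
    y_score = clf.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    fpr, tpr, _ = roc_curve(y_true=y_test, y_score=y_score)
    roc_auc = auc(x=fpr, y=tpr)
    plt.plot(fpr, tpr, color=clr, linestyle=ls,
             label=f'{label} (auc = {roc_auc:.2f})')

plt.plot([0, 1], [0, 1], linestyle='--', color='gray', linewidth=2)
plt.xlim([-0.1, 1.1])
plt.ylim([-0.1, 1.1])
plt.grid(alpha=0.5)
plt.xlabel('False positive rate (FPR)')
plt.ylabel('True positive rate (TPR)')
plt.legend(loc='lower right')
plt.show()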
Out[17]:
In [18]:
In [19]:
Out[19]:
In [20]:
Out[20]:
{'pipeline-1': Pipeline(steps=[('sc', StandardScaler()),
['clf', LogisticRegression(C=0.001, random_state=1)]]),
'decisiontreeclassifier': DecisionTreeClassifier(criterion='entropy', max_depth=1, random_state=0),
'pipeline-2': Pipeline(steps=[('sc', StandardScaler()),
['clf', KNeighborsClassifier(n_neighbors=1)]]),
'pipeline-1__memory': None,
'pipeline-1__steps': [('sc', StandardScaler()),
['clf', LogisticRegression(C=0.001, random_state=1)]],
'pipeline-1__verbose': False,
'pipeline-1__sc': StandardScaler(),
'pipeline-1__clf': LogisticRegression(C=0.001, random_state=1),
'pipeline-1__sc__copy': True,
'pipeline-1__sc__with_mean': True,
'pipeline-1__sc__with_std': True,
'pipeline-1__clf__C': 0.001,
'pipeline-1__clf__class_weight': None,
'pipeline-1__clf__dual': False,
'pipeline-1__clf__fit_intercept': True,
'pipeline-1__clf__intercept_scaling': 1,
'pipeline-1__clf__l1_ratio': None,
'pipeline-1__clf__max_iter': 100,
'pipeline-1__clf__multi_class': 'auto',
'pipeline-1__clf__n_jobs': None,
'pipeline-1__clf__penalty': 'l2',
'pipeline-1__clf__random_state': 1,
'pipeline-1__clf__solver': 'lbfgs',
'pipeline-1__clf__tol': 0.0001,
'pipeline-1__clf__verbose': 0,
'pipeline-1__clf__warm_start': False,
'decisiontreeclassifier__ccp_alpha': 0.0,
'decisiontreeclassifier__class_weight': None,
'decisiontreeclassifier__criterion': 'entropy',
'decisiontreeclassifier__max_depth': 1,
'decisiontreeclassifier__max_features': None,
'decisiontreeclassifier__max_leaf_nodes': None,
'decisiontreeclassifier__min_impurity_decrease': 0.0,
'decisiontreeclassifier__min_samples_leaf': 1,
'decisiontreeclassifier__min_samples_split': 2,
'decisiontreeclassifier__min_weight_fraction_leaf': 0.0,
'decisiontreeclassifier__random_state': 0,
'decisiontreeclassifier__splitter': 'best',
'pipeline-2__memory': None,
'pipeline-2__steps': [('sc', StandardScaler()),
['clf', KNeighborsClassifier(n_neighbors=1)]],
'pipeline-2__verbose': False,
'pipeline-2__sc': StandardScaler(),
'pipeline-2__clf': KNeighborsClassifier(n_neighbors=1),
'pipeline-2__sc__copy': True,
'pipeline-2__sc__with_mean': True,
'pipeline-2__sc__with_std': True,
'pipeline-2__clf__algorithm': 'auto',
'pipeline-2__clf__leaf_size': 30,
'pipeline-2__clf__metric': 'minkowski',
'pipeline-2__clf__metric_params': None,
'pipeline-2__clf__n_jobs': None,
'pipeline-2__clf__n_neighbors': 1,
'pipeline-2__clf__p': 2,
'pipeline-2__clf__weights': 'uniform'}
In [21]:
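A sketch of the grid search over the nested parameter names exposed by get_params; the parameter grid is inferred from the settings listed in Out[21] below:

from sklearn.model_selection import GridSearchCV

params = {'decisiontreeclassifier__max_depth': [1, 2],
          'pipeline-1__clf__C': [0.001, 0.1, 100.0]}

grid = GridSearchCV(estimator=mv_clf,
                    param_grid=params,
                    cv=10,
                    scoring='roc_auc')
grid.fit(X_train, y_train)

# Report mean and standard deviation of the CV score for every setting
for r, _ in enumerate(grid.cv_results_['mean_test_score']):
    mean_score = grid.cv_results_['mean_test_score'][r]
    std_dev = grid.cv_results_['std_test_score'][r]
    print(f'{mean_score:.3f} +/- {std_dev:.2f} '
          f'{grid.cv_results_["params"][r]}')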
Out[21]:
0.983 +/- 0.05 {'decisiontreeclassifier__max_depth': 1, 'pipeline-1__clf__C': 0.001}
0.983 +/- 0.05 {'decisiontreeclassifier__max_depth': 1, 'pipeline-1__clf__C': 0.1}
0.967 +/- 0.10 {'decisiontreeclassifier__max_depth': 1, 'pipeline-1__clf__C': 100.0}
0.983 +/- 0.05 {'decisiontreeclassifier__max_depth': 2, 'pipeline-1__clf__C': 0.001}
0.983 +/- 0.05 {'decisiontreeclassifier__max_depth': 2, 'pipeline-1__clf__C': 0.1}
0.967 +/- 0.10 {'decisiontreeclassifier__max_depth': 2, 'pipeline-1__clf__C': 100.0}
In [22]:
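Assuming grid is the fitted GridSearchCV object from the previous sketch, the best setting and its score can be read off directly:

print(f'Best parameters: {grid.best_params_}')
print(f'ROC AUC: {grid.best_score_:.2f}')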
Out[22]:
Best parameters: {'decisiontreeclassifier__max_depth': 1, 'pipeline-1__clf__C': 0.001}
ROC AUC: 0.98
Note
By default, refit in GridSearchCV is set to True (i.e., GridSearchCV(..., refit=True)), which means that we can use the fitted GridSearchCV estimator to make predictions via the predict method, for example:
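A minimal sketch, assuming grid is the GridSearchCV object fitted in the tuning step and X_test is the held-out split:

# Because refit=True, the best setting has been refit on the whole training
# set, so the grid object can be used for prediction directly
y_pred = grid.predict(X_test)

# The refit best model is also available as a standalone estimator
best_model = grid.best_estimator_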
In addition, the "best" estimator can directly be accessed via the best_estimator_
attribute.
In [23]:
Out[23]:
[Pipeline(steps=[('sc', StandardScaler()),
['clf', LogisticRegression(C=0.001, random_state=1)]]),
DecisionTreeClassifier(criterion='entropy', max_depth=1, random_state=0),
Pipeline(steps=[('sc', StandardScaler()),
['clf', KNeighborsClassifier(n_neighbors=1)]])]
In [24]:
In [25]:
Out[25]:
MajorityVoteClassifier(classifiers=[Pipeline(steps=[('sc', StandardScaler()),
('clf',
LogisticRegression(C=0.001,
random_state=1))]),
DecisionTreeClassifier(criterion='entropy',
max_depth=1,
random_state=0),
Pipeline(steps=[('sc', StandardScaler()),
('clf',
KNeighborsClassifier(n_neighbors=1))])])
In [26]:
Out[26]:
MajorityVoteClassifier(classifiers=[Pipeline(steps=[('sc', StandardScaler()),
('clf',
LogisticRegression(C=0.001,
random_state=1))]),
DecisionTreeClassifier(criterion='entropy',
max_depth=1,
random_state=0),
Pipeline(steps=[('sc', StandardScaler()),
('clf',
KNeighborsClassifier(n_neighbors=1))])])
Bagging -- Building an ensemble of classifiers from bootstrap samples
In [27]:
Out[27]:
Bagging in a nutshell
In [28]:
Out[28]:
Applying bagging to classify examples in the Wine dataset
In [29]:
In [30]:
In [31]:
In [32]:
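A minimal sketch of the bagging comparison, assuming X_train/X_test/y_train/y_test come from the Wine-data split prepared in the preceding cells; the hyperparameters (an unpruned entropy tree and 500 bootstrap rounds) are assumptions:

from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# A single, fully grown decision tree as the reference model
tree = DecisionTreeClassifier(criterion='entropy', max_depth=None, random_state=1)

# 500 trees, each trained on a bootstrap sample of the training set
bag = BaggingClassifier(estimator=tree,
                        n_estimators=500,
                        max_samples=1.0,
                        max_features=1.0,
                        bootstrap=True,
                        bootstrap_features=False,
                        n_jobs=1,
                        random_state=1)

for name, model in [('Decision tree', tree), ('Bagging', bag)]:
    model.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, model.predict(X_train))
    test_acc = accuracy_score(y_test, model.predict(X_test))
    print(f'{name} train/test accuracies {train_acc:.3f}/{test_acc:.3f}')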
Out[32]:
Decision tree train/test accuracies 1.000/0.833
Bagging train/test accuracies 1.000/0.917
In [ ]:
Leveraging weak learners via adaptive boosting
How boosting works
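The next cells step through a single AdaBoost round by hand. Here is a minimal sketch consistent with the values printed below; the label vectors y and yhat are assumptions chosen to match them (three mistakes at positions 6-8):

import numpy as np

# True labels and the predictions of a weak learner
y = np.array([1, 1, 1, -1, -1, -1, 1, 1, 1, -1])
yhat = np.array([1, 1, 1, -1, -1, -1, -1, -1, -1, -1])
correct = (y == yhat)

# Uniform starting weights
weights = np.full(10, 0.1)
print(weights)

# Weighted error rate; with uniform weights this is just the fraction of mistakes
epsilon = np.mean(~correct)
print(epsilon)

# Voting coefficient (importance) of this weak learner
alpha_j = 0.5 * np.log((1 - epsilon) / epsilon)
print(alpha_j)

# Weight updates: shrink correctly classified examples, boost the mistakes
update_if_correct = 0.1 * np.exp(-alpha_j * 1 * 1)
update_if_wrong = 0.1 * np.exp(-alpha_j * 1 * -1)
weights = np.where(correct, update_if_correct, update_if_wrong)
print(weights)

# Renormalize so the weights sum to 1
normalized_weights = weights / np.sum(weights)
print(normalized_weights)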
In [35]:
Out[35]:
In [36]:
Out[36]:
In [37]:
Out[37]:
[0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1]
0.3
In [38]:
Out[38]:
0.42364893019360184
In [39]:
Out[39]:
0.06546536707079771
In [40]:
Out[40]:
0.1527525231651947
In [41]:
Out[41]:
0.1527525231651947
In [42]:
Out[42]:
[0.06546537 0.06546537 0.06546537 0.06546537 0.06546537 0.06546537
0.15275252 0.15275252 0.15275252 0.06546537]
In [43]:
Out[43]:
[0.07142857 0.07142857 0.07142857 0.07142857 0.07142857 0.07142857
0.16666667 0.16666667 0.16666667 0.07142857]
Applying AdaBoost using scikit-learn
In [44]:
In [45]:
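A minimal sketch of the scikit-learn comparison, assuming the same Wine split as in the bagging section; the stump depth, number of boosting rounds, and learning rate are assumptions:

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# A decision stump as the weak learner
stump = DecisionTreeClassifier(criterion='entropy', max_depth=1, random_state=1)

ada = AdaBoostClassifier(estimator=stump,
                         n_estimators=500,
                         learning_rate=0.1,
                         random_state=1)

for name, model in [('Decision tree', stump), ('AdaBoost', ada)]:
    model.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, model.predict(X_train))
    test_acc = accuracy_score(y_test, model.predict(X_test))
    print(f'{name} train/test accuracies {train_acc:.3f}/{test_acc:.3f}')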
Out[45]:
Decision tree train/test accuracies 0.916/0.875
AdaBoost train/test accuracies 1.000/0.917
In [47]:
Out[47]:
Gradient boosting -- training an ensemble based on loss gradients
Comparing AdaBoost with gradient boosting
Outlining the general gradient boosting algorithm
Explaining the gradient boosting algorithm for classification
Illustrating gradient boosting for classification
In [46]:
Out[46]:
In [47]:
Out[47]:
In [48]:
Out[48]:
In [49]:
Out[49]:
In [50]:
Out[50]:
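The cells above illustrate the procedure step by step. As a complementary sketch (not the worked example itself), scikit-learn's built-in GradientBoostingClassifier can be fit on the same assumed Wine split, with illustrative hyperparameters:

from sklearn.ensemble import GradientBoostingClassifier

# Each new tree is fit to the gradients (pseudo-residuals) of the loss
# with respect to the current ensemble's predictions
gbm = GradientBoostingClassifier(n_estimators=20,
                                 learning_rate=0.01,
                                 max_depth=2,
                                 random_state=1)
gbm.fit(X_train, y_train)
print(f'Train accuracy: {gbm.score(X_train, y_train):.3f}')
print(f'Test accuracy: {gbm.score(X_test, y_test):.3f}')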
Using XGBoost
In [51]:
In [52]:
Out[52]:
'1.5.1'
In [53]:
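A minimal sketch of the XGBoost fit, assuming the same Wine split with labels encoded as 0/1; the hyperparameters are assumptions:

import xgboost as xgb
from sklearn.metrics import accuracy_score

# XGBClassifier expects class labels encoded as 0..n_classes-1
model = xgb.XGBClassifier(n_estimators=1000,
                          learning_rate=0.01,
                          max_depth=4,
                          random_state=1)
model.fit(X_train, y_train)

train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
print(f'XGboost train/test accuracies {train_acc:.3f}/{test_acc:.3f}')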
Out[53]:
[15:17:43] WARNING: /Users/runner/miniforge3/conda-bld/xgboost-split_1643226991592/work/src/learner.cc:1115: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.
XGboost train/test accuracies 0.968/0.917
Summary
...
Readers may ignore the next cell.
In [1]:
Out[1]:
[NbConvertApp] Converting notebook ch07.ipynb to script
[NbConvertApp] Writing 24357 bytes to ch07.py
In [ ]: