Ensembling Methods
An ensemble is a group of items that, acting together, produce a single combined effect.
For example, if I want to purchase a house, I can refer to different websites (Quicker, OLX, Housing, etc.), consult agents, or ask a friend, and then combine what they tell me before making a decision.
Ensembling refers to combining the decisions of multiple models into a single final decision, which usually improves overall performance compared with any individual model.
Basic Ensembling Techniques
Max Voting
The max voting method is generally used for classification problems. In this technique, multiple models are used to make predictions for each data point, and each model's prediction is counted as a 'vote'. The class predicted by the majority of the models is used as the final prediction.
Sample Algo
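Below is a minimal sketch of max voting using scikit-learn's VotingClassifier with hard voting; the dataset and the three base models are placeholders chosen only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import VotingClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# voting='hard' means every model casts one vote and the majority class wins.
ensemble = VotingClassifier(
    estimators=[
        ('lr', LogisticRegression(max_iter=1000)),
        ('dt', DecisionTreeClassifier()),
        ('knn', KNeighborsClassifier()),
    ],
    voting='hard',
)
ensemble.fit(X_train, y_train)
print('Max-voting accuracy:', ensemble.score(X_test, y_test))
```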
Averaging
In this method, we take an average of predictions from all the models and use it to make the final prediction. Averaging can be used for making predictions in regression problems or while calculating probabilities for classification problems.
Sample Algo
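A minimal sketch of simple averaging for a regression problem; the models and the dataset are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

models = [LinearRegression(), DecisionTreeRegressor(), KNeighborsRegressor()]
predictions = [m.fit(X_train, y_train).predict(X_test) for m in models]

# The ensemble prediction is the simple mean of the individual predictions.
avg_prediction = np.mean(predictions, axis=0)
print('Averaging MSE:', mean_squared_error(y_test, avg_prediction))
```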
Weighted Average
This is an extension of the averaging method. All models are assigned different weights defining the importance of each model for prediction.
Sample Algo
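A minimal sketch of a weighted average; it is the same idea as plain averaging except that each model's prediction is scaled by a weight reflecting how much we trust that model. The weights below are arbitrary illustrative values, not tuned ones.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

models = [LinearRegression(), DecisionTreeRegressor(), KNeighborsRegressor()]
predictions = [m.fit(X_train, y_train).predict(X_test) for m in models]

weights = [0.5, 0.2, 0.3]   # illustrative weights; they should sum to 1
weighted_prediction = np.average(predictions, axis=0, weights=weights)
print('Weighted-average MSE:', mean_squared_error(y_test, weighted_prediction))
```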
Advanced Ensembling Methods
Bagging
In this method, we combine the results of multiple models to get a more generalized result.
Bagging is short for bootstrap aggregation: a number of samples are drawn from the dataset with replacement, and each such sample set is regarded as a bootstrap sample.
A model is trained on each bootstrap sample; the models are weighted equally, and their results are then aggregated (for example by majority vote or averaging).
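The sketch below implements bootstrap aggregation by hand to make the idea concrete: draw bootstrap samples with replacement, fit one decision tree per sample, and combine the equally weighted trees by majority vote. The dataset and the number of estimators are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

rng = np.random.default_rng(42)
n_estimators = 25
all_preds = []
for _ in range(n_estimators):
    # Bootstrap sample: same size as the training set, drawn with replacement.
    idx = rng.integers(0, len(X_train), size=len(X_train))
    tree = DecisionTreeClassifier(random_state=0).fit(X_train[idx], y_train[idx])
    all_preds.append(tree.predict(X_test))

# Equal weighting: every tree gets one vote; the majority class is the answer.
all_preds = np.array(all_preds)                  # shape (n_estimators, n_test)
majority = np.array([np.bincount(col).argmax() for col in all_preds.T])
print('Bagged-trees accuracy:', np.mean(majority == y_test))
```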
Key Features
OOB Error Estimation
Out-of-bag (OOB) error, also called the out-of-bag estimate, is a method of measuring the prediction error of random forests, boosted decision trees, and other machine learning models that use bootstrap aggregating (bagging) to sub-sample the training data: each model is evaluated only on the training samples that were left out of its bootstrap sample.
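As a concrete example, scikit-learn's random forest can report this estimate directly when oob_score=True is set; this short sketch on an illustrative dataset shows how.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# Each tree is scored on the training rows left out of its bootstrap sample,
# giving a validation-like estimate without a separate hold-out set.
forest = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=42)
forest.fit(X, y)

print('OOB accuracy:', forest.oob_score_)   # OOB error is 1 - OOB accuracy
```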
Bagging algorithms: the bagging meta-estimator (e.g. scikit-learn's BaggingClassifier/BaggingRegressor) and the random forest.
Boosting
This method is similar in spirit to bagging, but the models are trained sequentially rather than independently: each new classifier pays more attention to the examples the previous classifiers got wrong, and the classifiers are assigned varying weights.
Operates via weighted voting: the final result is derived from a weighted vote of all the classifiers.
Key Features
Models are built sequentially, with each one correcting the errors of its predecessors; misclassified examples receive higher weight in later rounds; the final prediction is a weighted combination of all the models.
Boosting algorithms: AdaBoost, Gradient Boosting (GBM), XGBoost, and LightGBM.
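A minimal boosting sketch with scikit-learn's AdaBoostClassifier (whose default weak learner is a depth-1 decision tree); the dataset and hyper-parameters are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Weak learners are trained one after another; each round up-weights the
# examples the previous rounds misclassified, and the final prediction is
# a weighted vote over all the learners.
booster = AdaBoostClassifier(n_estimators=50, learning_rate=1.0, random_state=42)
booster.fit(X_train, y_train)
print('Boosting accuracy:', booster.score(X_test, y_test))
```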
Stacking
This method trains multiple different classifiers, as opposed to various incarnations of the same learner. While bagging and boosting use numerous models built from instances of the same classification algorithm (e.g. decision trees), stacking builds its models using different classification algorithms (perhaps decision trees, logistic regression, or some other combination).
A combiner algorithm is then trained to make the final predictions using the predictions of the base algorithms as its inputs. This combiner can be any learning algorithm, but logistic regression is often found to be an adequate and simple choice for this combining step.
Along with classification, stacking can also be employed in unsupervised learning tasks such as density estimation.
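A minimal stacking sketch with scikit-learn's StackingClassifier: different kinds of base learners are trained, and a logistic-regression combiner (the meta-learner) is fitted on their cross-validated predictions. The choice of base models and the dataset are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import StackingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

stack = StackingClassifier(
    estimators=[
        ('dt', DecisionTreeClassifier()),
        ('knn', KNeighborsClassifier()),
        ('svm', SVC()),
    ],
    final_estimator=LogisticRegression(max_iter=1000),  # the combiner
    cv=5,  # the combiner is trained on cross-validated base-model predictions
)
stack.fit(X_train, y_train)
print('Stacking accuracy:', stack.score(X_test, y_test))
```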