# Stack Machine Learning Models – Get Better Results

Sometimes you discover small tips and tricks that improve your code and make life easier, for example through better maintainability or efficiency. Stacking is one of those improvements for your machine learning, except that it is essential and takes some extra thought to implement.

The goal is to introduce you, the developer, to stacking in machine learning. Using your own models, you will learn how to apply stacking to your own datasets. Follow this article and get better results — it’s that simple.

## What is Model Stacking?

Stacking is the process of using different machine learning models one after another, where the predictions from each model are added as a new feature.

There are generally two variants of stacking, variant A and variant B. This article focuses on variant A, since it seems to get the better results of the two: models overfit to the training data more easily in variant B. This is likely also why practitioners use variant A, although it does not eliminate overfitting.

Please note that there is no single correct way of implementing model stacking (David H. Wolpert), because model stacking only describes the process of combining many models with a final generalized model. There exist several ways to implement model stacking, some of which have been proven to work well in practice. This is why we explore variant A.

### Explaining The Model Stacking Process

Model stacking should always be accompanied by cross-validation to reduce overfitting of the models to the training data. This is common practice.

Model stacking will seem like a simple technique for improving your results once you understand what happens inside the algorithm. However, there are many components interacting, and keeping track of all of them can be quite difficult, especially when first learning the concept. To help you fully understand the algorithm, I created a step-by-step image and description.

For starters, when doing model stacking with cross-validation, we require three parameters: a training dataset, a holdout dataset (validation dataset), and a list of models called `models_to_train`.

The most essential part here is that each model's predictions become a slice of a new feature, such that each model gets to predict a slice of the training data for this new feature.
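To make the "slice of a new feature" idea concrete, here is a minimal, dependency-free sketch. It is not the implementation used later in this article: `MeanModel`, `k_fold_slices` and `out_of_fold_feature` are hypothetical names invented for illustration, and the toy model simply predicts the mean of its training targets. The sketch assumes contiguous folds and a regression target. In each fold, a fresh model is trained on the other folds and fills in only the held-out slice of the new feature.

```python
from statistics import mean


class MeanModel:
    """Toy regressor (hypothetical stand-in for a real model): predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = mean(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def k_fold_slices(n_rows, n_folds):
    """Split row indices 0..n_rows-1 into contiguous, near-equal slices."""
    base, extra = divmod(n_rows, n_folds)
    slices, start = [], 0
    for fold in range(n_folds):
        stop = start + base + (1 if fold < extra else 0)
        slices.append(range(start, stop))
        start = stop
    return slices


def out_of_fold_feature(make_model, X, y, n_folds=3):
    """Build one new feature column from out-of-fold predictions."""
    new_feature = [None] * len(X)
    for held_out in k_fold_slices(len(X), n_folds):
        held = set(held_out)
        train_rows = [i for i in range(len(X)) if i not in held]
        # Train on every row outside the held-out slice.
        model = make_model().fit(
            [X[i] for i in train_rows], [y[i] for i in train_rows]
        )
        # The fitted model predicts only the slice it has never seen.
        preds = model.predict([X[i] for i in held_out])
        for i, pred in zip(held_out, preds):
            new_feature[i] = pred
    return new_feature


# Tiny made-up dataset, purely for illustration.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

feature = out_of_fold_feature(MeanModel, X, y, n_folds=3)
# Append the out-of-fold predictions as an extra column.
X_stacked = [row + [pred] for row, pred in zip(X, feature)]
```

Because every value in `feature` comes from a model that never saw that row during training, the new column is less prone to leaking the training targets than predictions made by a model fitted on the full training set.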

Now, let's put figure 1 into text, to actually explain what goes on! Later on, you will encounter a real example in Python.

1. Gather models with optim
[...]