Scikit-learn transformers and pipelines
All skforecast forecasters accept two arguments that give finer control over how input data is transformed. This is particularly useful because many machine learning models require specific pre-processing. For example, linear models may benefit from scaled features, and categorical features usually need to be encoded as numerical values.
transformer_y
: an instance of a transformer (preprocessor) compatible with the scikit-learn preprocessing API, providing the methods fit, transform, fit_transform, and inverse_transform. Scikit-learn ColumnTransformer is not allowed because it does not have the inverse_transform method.
transformer_exog
: an instance of a transformer (preprocessor) compatible with the scikit-learn preprocessing API. Scikit-learn ColumnTransformer can be used if the transformations apply only to specific columns or if different columns need different transformations. For example, scale numeric features and one-hot encode categorical ones.
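As a minimal sketch of what such transformer objects might look like, the snippet below builds one candidate for each argument using scikit-learn only. The column names temp and weekday and the sample values are hypothetical, and the step of passing the objects to a forecaster is left out because it depends on the forecaster class in use.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical exogenous variables: one numeric and one categorical column.
exog = pd.DataFrame({
    "temp": [10.0, 12.0, 14.0, 16.0],
    "weekday": ["mon", "tue", "mon", "tue"],
})

# Candidate for transformer_y: StandardScaler provides inverse_transform.
transformer_y = StandardScaler()

# Candidate for transformer_exog: scale the numeric column and
# one-hot encode the categorical one.
transformer_exog = ColumnTransformer(
    transformers=[
        ("scale", StandardScaler(), ["temp"]),
        ("onehot", OneHotEncoder(handle_unknown="ignore"), ["weekday"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

exog_transformed = transformer_exog.fit_transform(exog)
print(exog_transformed.shape)

# Check which objects expose inverse_transform.
print(hasattr(transformer_y, "inverse_transform"))
print(hasattr(transformer_exog, "inverse_transform"))
```

The one-hot encoder expands weekday into one column per category, so the transformed array has more columns than the original data frame.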
Transformations are learned and applied before training the forecaster and are automatically used when calling predict. The output of predict is always on the same scale as the original series y.
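The round trip described above can be illustrated with scikit-learn alone. This is a sketch of the idea, not skforecast's internal code: the toy series, the single lag, and the LinearRegression model are all assumptions chosen for brevity.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Toy series (illustrative values only).
y = np.array([10.0, 20.0, 30.0, 40.0, 50.0]).reshape(-1, 1)

# 1. Learn and apply the transformation before training.
scaler = StandardScaler()
y_scaled = scaler.fit_transform(y)

# 2. Train on the transformed series, using the previous value as predictor.
X_train = y_scaled[:-1]
y_train = y_scaled[1:].ravel()
model = LinearRegression().fit(X_train, y_train)

# 3. Predict in the transformed scale, then map back to the original scale.
pred_scaled = model.predict(y_scaled[-1:])
pred = scaler.inverse_transform(pred_scaled.reshape(-1, 1)).ravel()
print(pred)
```

The final inverse_transform call is the reason transformer_y must implement that method: without it, predictions could not be returned on the scale of the original series.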
Although skforecast has allowed using scikit-learn pipelines as regressors since version 0.4.0, it is recommended to use transformer_y and transformer_exog instead.
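For reference, a scikit-learn pipeline used as a regressor might look like the sketch below. The choice of StandardScaler and Ridge and the random training data are assumptions for illustration. The pipeline exposes fit and predict like any other regressor, and its scaling step acts only on the predictor matrix X.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical training matrix (for example, two lagged values) and target.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
y = X[:, 0] * 2.0 + X[:, 1]

# The pipeline scales X and then fits the regressor on the scaled values.
pipeline = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
pipeline.fit(X, y)

predictions = pipeline.predict(X)
print(predictions.shape)
```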