Understanding the forecaster attributes¶

While a forecaster is created and trained, it stores a lot of useful information in its attributes. This guide explores the main attributes of a ForecasterRecursive, but the same ideas apply to any of the skforecast forecasters.

Create and train a forecaster¶

To create and train a forecaster, at least the regressor and either lags or window_features (or both) must be specified.

In [1]:

# Libraries
# ==============================================================================
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from skforecast.datasets import load_demo_dataset
from skforecast.preprocessing import RollingFeatures
from skforecast.recursive import ForecasterRecursive

In [2]:

# Download data
# ==============================================================================
y = load_demo_dataset()

# Create and fit forecaster
# ==============================================================================
forecaster = ForecasterRecursive(
                 regressor       = RandomForestRegressor(random_state=123),
                 lags            = 5,
                 window_features = RollingFeatures(stats=['mean'], window_sizes=[7])
             )

forecaster.fit(y=y)
forecaster

Out[2]:

ForecasterRecursive

General Information

Regressor: RandomForestRegressor
Lags: [1 2 3 4 5]
Window features: ['roll_mean_7']
Window size: 7
Exogenous included: False
Weight function included: False
Differentiation order: None
Creation date: 2024-11-10 20:29:26
Last fit date: 2024-11-10 20:29:26
Skforecast version: 0.14.0
Python version: 3.11.10
Forecaster id: None

Exogenous Variables

None

Data Transformations

Transformer for y: None
Transformer for exog: None

Training Information

Training range: [Timestamp('1991-07-01 00:00:00'), Timestamp('2008-06-01 00:00:00')]
Training index type: DatetimeIndex
Training index frequency: MS

Regressor Parameters

{'bootstrap': True, 'ccp_alpha': 0.0, 'criterion': 'squared_error', 'max_depth': None, 'max_features': 1.0, 'max_leaf_nodes': None, 'max_samples': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'monotonic_cst': None, 'n_estimators': 100, 'n_jobs': None, 'oob_score': False, 'random_state': 123, 'verbose': 0, 'warm_start': False}

Fit Kwargs

{}


In [3]:

# List of attributes
# ==============================================================================
for attribute, value in forecaster.__dict__.items():
    print(attribute)

regressor
transformer_y
transformer_exog
weight_func
source_code_weight_func
differentiation
differentiator
last_window_
index_type_
index_freq_
training_range_
exog_in_
exog_names_in_
exog_type_in_
exog_dtypes_in_
X_train_window_features_names_out_
X_train_exog_names_out_
X_train_features_names_out_
in_sample_residuals_
out_sample_residuals_
in_sample_residuals_by_bin_
out_sample_residuals_by_bin_
creation_date
is_fitted
fit_date
skforecast_version
python_version
forecaster_id
lags
lags_names
max_lag
window_features
window_features_names
max_size_window_features
window_size
window_features_class_names
binner_kwargs
binner
binner_intervals_
fit_kwargs
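
Many names in the list above end with an underscore. In the scikit-learn convention, a trailing underscore marks attributes whose values are learned during fit, and the list appears to follow the same pattern. The sketch below separates the two groups for a small hypothetical class (TinyForecaster is an illustration only, not part of skforecast); the same filter could be applied to forecaster.__dict__.

```python
class TinyForecaster:
    """Hypothetical stand-in used only for illustration (not a skforecast class)."""

    def __init__(self, lags):
        self.lags = lags
        self.is_fitted = False
        self.last_window_ = None  # placeholder, filled in by fit()

    def fit(self, y):
        # Keep the last `lags` observations, as a fitted attribute
        self.last_window_ = list(y[-self.lags:])
        self.is_fitted = True
        return self


forecaster_demo = TinyForecaster(lags=3).fit([1, 2, 3, 4, 5])

# Attributes set during fit end with "_"; the rest describe the configuration
fitted_attrs = {k: v for k, v in vars(forecaster_demo).items() if k.endswith("_")}
config_attrs = {k: v for k, v in vars(forecaster_demo).items() if not k.endswith("_")}

print("Fitted attributes :", fitted_attrs)
print("Other attributes  :", config_attrs)
```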

Regressor¶

Skforecast is a Python library that facilitates using scikit-learn regressors as multi-step forecasters. It also works with any regressor compatible with the scikit-learn API. The regressor passed to the forecaster is stored in the regressor attribute.

In [4]:

# Forecaster regressor
# ==============================================================================
forecaster.regressor

Out[4]:

RandomForestRegressor(random_state=123)


In [5]:

# Show regressor parameters
# ==============================================================================
forecaster.regressor.get_params(deep=True)

Out[5]:

{'bootstrap': True,
 'ccp_alpha': 0.0,
 'criterion': 'squared_error',
 'max_depth': None,
 'max_features': 1.0,
 'max_leaf_nodes': None,
 'max_samples': None,
 'min_impurity_decrease': 0.0,
 'min_samples_leaf': 1,
 'min_samples_split': 2,
 'min_weight_fraction_leaf': 0.0,
 'monotonic_cst': None,
 'n_estimators': 100,
 'n_jobs': None,
 'oob_score': False,
 'random_state': 123,
 'verbose': 0,
 'warm_start': False}

✎ Note

In the forecasters that follow a Direct Strategy (ForecasterDirect and ForecasterDirectMultiVariate), one instance of the regressor is trained for each step. All of them are stored in the regressors_ attribute.

Lags¶

Lags used as predictors. The index starts at 1, so lag 1 corresponds to the value at t-1.
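
To make the indexing concrete, the following sketch builds a small matrix of lagged predictors with NumPy. It uses a toy series and three lags, and it only illustrates the idea of lag 1 being t-1; it is not the code skforecast runs internally.

```python
import numpy as np

# Toy series (illustrative values, not the demo dataset)
y = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0])
lags = np.array([1, 2, 3])
max_lag = int(lags.max())

# The first usable target is y[max_lag], because earlier positions
# do not have all the required lags available.
targets = y[max_lag:]

# Column j holds the value observed `lags[j]` steps before each target
X_lags = np.column_stack([y[max_lag - lag : len(y) - lag] for lag in lags])

# First row: lag_1 = 12, lag_2 = 11, lag_3 = 10, and the target is 13
print(X_lags[0], "->", targets[0])
```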

In [6]:

# Forecaster lags
# ==============================================================================
forecaster.lags

Out[6]:

array([1, 2, 3, 4, 5])

In [7]:

# Lags information
# ==============================================================================
print("Lags names : ", forecaster.lags_names)
print("Max lag    : ", forecaster.max_lag)

Lags names :  ['lag_1', 'lag_2', 'lag_3', 'lag_4', 'lag_5']
Max lag    :  5

Window features¶

When forecasting time series data, it may be useful to consider additional characteristics beyond just the lagged values. For example, the moving average of the previous n values may help to capture the trend in the series. The window_features argument allows the inclusion of additional predictors created with the previous values of the series.
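
As a rough illustration of such a predictor, the sketch below computes a moving average over the previous n values of a toy series with NumPy. It assumes the feature for time t uses only observations strictly before t; the window of 3 and the variable names are chosen for the example, and the actual RollingFeatures implementation may differ in detail.

```python
import numpy as np

# Toy series: 1, 2, ..., 10 (illustrative values)
y = np.arange(1.0, 11.0)
window = 3

# For each target position t, average the `window` values that precede it.
# The value at t itself is excluded, so the target does not leak into the feature.
roll_mean = np.array([y[t - window : t].mean() for t in range(window, len(y))])

print(roll_mean)
```

The first `window` observations produce no feature because they lack enough history, which is why window features also contribute to the window size discussed in the next section.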

In [8]:

# Forecaster window features
# ==============================================================================
forecaster.window_features

Out[8]:

[RollingFeatures(
     stats           = ['mean'],
     window_sizes    = [7],
     Max window size = 7,
     min_periods     = [7],
     features_names  = ['roll_mean_7'],
     fillna          = None
 )]

In [9]:

# Window features information
# ==============================================================================
print("Window features names   : ", forecaster.window_features_names)
print("Max window size wf      : ", forecaster.max_size_window_features)
print("Window features classes : ", forecaster.window_features_class_names)

Window features names   :  ['roll_mean_7']
Max window size wf      :  7
Window features classes :  ['RollingFeatures']

Window size¶

The size of the data window needed to create the predictors. It is the larger of the maximum lag and the maximum window size required by the window features.
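
The arithmetic can be sketched in plain Python with the values used in this guide. The last step assumes that the first window_size observations cannot be used as training targets, which is consistent with the 197 in-sample residuals reported later on this page; treat it as a plausibility check rather than a statement about the internals.

```python
# Values taken from the forecaster created in this guide
max_lag = 5
max_size_window_features = 7

# The window must be long enough for both the lags and the window features
window_size = max(max_lag, max_size_window_features)

# Monthly observations from 1991-07 to 2008-06, both inclusive
n_observations = (2008 - 1991) * 12 + (6 - 7) + 1

# Assumption: the first `window_size` observations only serve as history
n_training_rows = n_observations - window_size

print("Window size   :", window_size)
print("Observations  :", n_observations)
print("Training rows :", n_training_rows)
```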

In [10]:

# Forecaster window size
# ==============================================================================
print("Max lag            : ", forecaster.max_lag)
print("Max window size wf : ", forecaster.max_size_window_features)
print("Window size        : ", forecaster.window_size)

Max lag            :  5
Max window size wf :  7
Window size        :  7

Last window¶

The last window of data the forecaster saw during training. It stores the values needed to predict the step immediately after the training data.
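
The sketch below shows, on a toy series, why the last window_size observations are enough to build the predictors for the first forecast step. The series, the lags and the rolling mean are illustrative assumptions that mirror the configuration used in this guide (5 lags and a 7-value rolling mean).

```python
import numpy as np

# Toy training series: 1, 2, ..., 12 (illustrative values)
y = np.arange(1.0, 13.0)
lags = np.array([1, 2, 3, 4, 5])
window_size = 7

# Keep only the tail of the series that the predictors need
last_window = y[-window_size:]

# Predictors for the first step after the training data:
# lag 1 is the most recent value, lag 2 the one before it, and so on
lag_predictors = last_window[-lags]
roll_mean_7 = last_window[-7:].mean()

print("Last window    :", last_window)
print("Lag predictors :", lag_predictors)
print("Rolling mean 7 :", roll_mean_7)
```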

In [11]:

# Forecaster last window
# ==============================================================================
forecaster.last_window_

Out[11]:

                   y
datetime
2007-12-01  1.176589
2008-01-01  1.219941
2008-02-01  0.761822
2008-03-01  0.649435
2008-04-01  0.827887
2008-05-01  0.816255
2008-06-01  0.762137

💡 Tip

To learn how to put your forecasters into production and get the most out of last_window, see the user guide "Using forecasting models in production".

In-sample residuals¶

Residuals from the model's predictions on the training data. If transformer_y is not None, the residuals are stored in the transformed scale.

✎ Note

In the forecasters that follow a Direct Strategy (ForecasterDirect and ForecasterDirectMultiVariate) and in the Global Forecasting Models: Independent multi-series forecasting (ForecasterRecursiveMultiSeries), this attribute is a dict containing the residuals for each regressor or series.
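
As a minimal sketch of what in-sample residuals are, the example below fits a stand-in model (a straight line fitted by least squares, not a skforecast forecaster) and subtracts its predictions on the training data from the observed values. The sign convention, observed minus predicted, is an assumption made for the example.

```python
import numpy as np

# Synthetic training data: a noisy straight line (illustrative only)
rng = np.random.default_rng(123)
x_train = np.arange(20, dtype=float)
y_train = 2.0 * x_train + 1.0 + rng.normal(scale=0.5, size=x_train.size)

# Stand-in model: least-squares line (np.polyfit returns slope, then intercept)
slope, intercept = np.polyfit(x_train, y_train, deg=1)
predictions = slope * x_train + intercept

# In-sample residuals: observed training values minus predictions on the same data
in_sample_residuals = y_train - predictions

print("Length:", len(in_sample_residuals))
```

Because the model is evaluated on the same data it was fitted on, these residuals tend to understate the error on new data, which is the motivation for the out-of-sample residuals described next.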

In [12]:

# Forecaster in-sample residuals
# ==============================================================================
print("Length:", len(forecaster.in_sample_residuals_))
forecaster.in_sample_residuals_[:5]

Length: 197

Out[12]:

array([-0.13137208, -0.03279934, -0.00417238, -0.02469213,  0.00563615])

Out-of-sample residuals¶

Residuals from models predicting non-training data. If transformer_y is not None, the residuals are assumed to be in the transformed scale. Use the set_out_sample_residuals method to set the values.

As no values have been added yet, the attribute is None.

In [13]:

# Forecaster out-of-sample residuals
# ==============================================================================
forecaster.out_sample_residuals_