Understanding the forecaster attributes¶
While a forecaster is created and trained, the object stores a lot of information in its attributes that can be useful to the user. This guide explores the main attributes of a ForecasterRecursive, but the same ideas apply to any of the skforecast forecasters.
Create and train a forecaster¶
To create and train a forecaster, at least regressor and either lags or window_features (or both) must be specified.
# Libraries
# ==============================================================================
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from skforecast.datasets import load_demo_dataset
from skforecast.preprocessing import RollingFeatures
from skforecast.recursive import ForecasterRecursive
# Download data
# ==============================================================================
y = load_demo_dataset()
# Create and fit forecaster
# ==============================================================================
forecaster = ForecasterRecursive(
    regressor = RandomForestRegressor(random_state=123),
    lags = 5,
    window_features = RollingFeatures(stats=['mean'], window_sizes=[7])
)
forecaster.fit(y=y)
forecaster
ForecasterRecursive
General Information
- Regressor: RandomForestRegressor
- Lags: [1 2 3 4 5]
- Window features: ['roll_mean_7']
- Window size: 7
- Exogenous included: False
- Weight function included: False
- Differentiation order: None
- Creation date: 2024-11-10 20:29:26
- Last fit date: 2024-11-10 20:29:26
- Skforecast version: 0.14.0
- Python version: 3.11.10
- Forecaster id: None
Exogenous Variables
- None
Data Transformations
- Transformer for y: None
- Transformer for exog: None
Training Information
- Training range: [Timestamp('1991-07-01 00:00:00'), Timestamp('2008-06-01 00:00:00')]
- Training index type: DatetimeIndex
- Training index frequency: MS
Regressor Parameters
- {'bootstrap': True, 'ccp_alpha': 0.0, 'criterion': 'squared_error', 'max_depth': None, 'max_features': 1.0, 'max_leaf_nodes': None, 'max_samples': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'monotonic_cst': None, 'n_estimators': 100, 'n_jobs': None, 'oob_score': False, 'random_state': 123, 'verbose': 0, 'warm_start': False}
Fit Kwargs
- {}
# List of attributes
# ==============================================================================
for attribute, value in forecaster.__dict__.items():
    print(attribute)
regressor
transformer_y
transformer_exog
weight_func
source_code_weight_func
differentiation
differentiator
last_window_
index_type_
index_freq_
training_range_
exog_in_
exog_names_in_
exog_type_in_
exog_dtypes_in_
X_train_window_features_names_out_
X_train_exog_names_out_
X_train_features_names_out_
in_sample_residuals_
out_sample_residuals_
in_sample_residuals_by_bin_
out_sample_residuals_by_bin_
creation_date
is_fitted
fit_date
skforecast_version
python_version
forecaster_id
lags
lags_names
max_lag
window_features
window_features_names
max_size_window_features
window_size
window_features_class_names
binner_kwargs
binner
binner_intervals_
fit_kwargs
Regressor¶
Skforecast is a Python library that facilitates using scikit-learn regressors as multi-step forecasters. It also works with any regressor compatible with the scikit-learn API.
# Forecaster regressor
# ==============================================================================
forecaster.regressor
RandomForestRegressor(random_state=123)
# Show regressor parameters
# ==============================================================================
forecaster.regressor.get_params(deep=True)
{'bootstrap': True, 'ccp_alpha': 0.0, 'criterion': 'squared_error', 'max_depth': None, 'max_features': 1.0, 'max_leaf_nodes': None, 'max_samples': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'monotonic_cst': None, 'n_estimators': 100, 'n_jobs': None, 'oob_score': False, 'random_state': 123, 'verbose': 0, 'warm_start': False}
✎ Note
In the forecasters that follow a direct strategy (ForecasterDirect and ForecasterDirectMultiVariate), one instance of the regressor is trained for each step. All of them are stored in self.regressors_.
Lags¶
Lags used as predictors. Index starts at 1, so lag 1 is equal to t-1.
# Forecaster lags
# ==============================================================================
forecaster.lags
array([1, 2, 3, 4, 5])
# Lags information
# ==============================================================================
print("Lags names : ", forecaster.lags_names)
print("Max lag : ", forecaster.max_lag)
Lags names :  ['lag_1', 'lag_2', 'lag_3', 'lag_4', 'lag_5']
Max lag :  5
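To make the lag convention concrete, the following is a minimal NumPy sketch of how a matrix of lagged predictors can be built from a series. It illustrates the idea only and is not the code skforecast runs internally; the toy series and the variable names X and target are assumptions made for this example.

```python
import numpy as np

# Toy series; the values are arbitrary and only serve the illustration.
y = np.arange(10, dtype=float)   # 0.0, 1.0, ..., 9.0
lags = np.array([1, 2, 3])       # same convention as forecaster.lags
max_lag = lags.max()

# One row per time step t that has max_lag previous observations.
# Column j holds y[t - lags[j]], so the first column is lag_1 (t-1).
X = np.column_stack([y[max_lag - lag : len(y) - lag] for lag in lags])
target = y[max_lag:]

lags_names = [f"lag_{lag}" for lag in lags]
print(lags_names)        # ['lag_1', 'lag_2', 'lag_3']
print(X[0], target[0])   # [2. 1. 0.] 3.0
```

The first usable target is y[3], and its predictors are the three values that precede it, ordered from most recent to oldest.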
Window features¶
When forecasting time series data, it may be useful to consider additional characteristics beyond just the lagged values. For example, the moving average of the previous n values may help to capture the trend in the series. The window_features argument allows the inclusion of additional predictors created from the previous values of the series.
# Forecaster window features
# ==============================================================================
forecaster.window_features
[RollingFeatures(
    stats = ['mean'],
    window_sizes = [7],
    Max window size = 7,
    min_periods = [7],
    features_names = ['roll_mean_7'],
    fillna = None
)]
# Window features information
# ==============================================================================
print("Window features names : ", forecaster.window_features_names)
print("Max window size wf : ", forecaster.max_size_window_features)
print("Window features classes : ", forecaster.window_features_class_names)
Window features names :  ['roll_mean_7']
Max window size wf :  7
Window features classes :  ['RollingFeatures']
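As a rough illustration of what a feature such as roll_mean_7 represents, the sketch below computes a rolling mean over previous values only, using plain pandas. It is an assumption-laden stand-in for the concept, not the RollingFeatures implementation; the toy series and the window of 3 are chosen just to keep the numbers easy to check.

```python
import numpy as np
import pandas as pd

y = pd.Series(np.arange(10, dtype=float), name="y")
window = 3

# shift(1) makes the window end at t-1, so the feature for time t never
# uses y[t] itself. Rows without a complete window stay as NaN.
roll_mean = y.shift(1).rolling(window=window, min_periods=window).mean()
roll_mean.name = f"roll_mean_{window}"

print(roll_mean.iloc[window])   # mean of y[0], y[1], y[2] -> 1.0
```

Shifting before rolling is the design choice that matters here: without it, the value being predicted would leak into its own predictor.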
Window size¶
The size of the data window needed to create the predictors. It is the larger of the maximum lag and the largest window required by the window features.
# Forecaster window size
# ==============================================================================
print("Max lag : ", forecaster.max_lag)
print("Max window size wf : ", forecaster.max_size_window_features)
print("Window size : ", forecaster.window_size)
Max lag :  5
Max window size wf :  7
Window size :  7
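The arithmetic behind the window size can be checked by hand. The sketch below rebuilds the training index from the range and frequency reported by the forecaster and assumes that the first window_size observations cannot become training rows because their predictors are incomplete; the variable names are illustrative.

```python
import pandas as pd

# Training range and frequency as reported by the forecaster above.
train_index = pd.date_range("1991-07-01", "2008-06-01", freq="MS")

max_lag = 5
max_size_window_features = 7
window_size = max(max_lag, max_size_window_features)

# Assumption: the first window_size observations have no complete set of
# predictors, so they cannot be used as training rows.
n_training_rows = len(train_index) - window_size

print(window_size)       # 7
print(n_training_rows)   # 197
```

Under that assumption, 204 monthly observations leave 197 training rows, which agrees with the length of in_sample_residuals_ reported in this guide.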
Last window¶
Last window of data the forecaster saw during training. It stores the values needed to predict the step immediately after the training data.
# Forecaster last window
# ==============================================================================
forecaster.last_window_
datetime | y |
---|---|
2007-12-01 | 1.176589 |
2008-01-01 | 1.219941 |
2008-02-01 | 0.761822 |
2008-03-01 | 0.649435 |
2008-04-01 | 0.827887 |
2008-05-01 | 0.816255 |
2008-06-01 | 0.762137 |
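The following sketch shows why these seven values are enough to predict the next step: the five lags and the 7-value rolling mean can all be derived from them. The values are copied from the table above; the column order (lags first, then the window feature) and the names x_next and next_timestamp are assumptions for illustration, not a description of skforecast's internal code.

```python
import numpy as np
import pandas as pd

# Values copied from forecaster.last_window_ above.
last_window = pd.Series(
    [1.176589, 1.219941, 0.761822, 0.649435, 0.827887, 0.816255, 0.762137],
    index=pd.date_range("2007-12-01", periods=7, freq="MS"),
    name="y",
)
lags = np.array([1, 2, 3, 4, 5])

# lag_1 is the most recent value, lag_2 the one before it, and so on.
lag_values = last_window.to_numpy()[-lags]
# roll_mean_7 averages the 7 most recent values.
roll_mean_7 = last_window.iloc[-7:].mean()

# Predictor row for the first step after the training data.
x_next = np.concatenate([lag_values, [roll_mean_7]])
next_timestamp = last_window.index[-1] + last_window.index.freq

print(next_timestamp)   # 2008-07-01 00:00:00
print(x_next.shape)     # (6,)
```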
💡 Tip
Learn how to get your forecasters into production and get the most out of them with last_window: Using forecasting models in production.
In-sample residuals¶
Residuals from the model's predictions on the training data. If transformer_y is not None, the residuals are stored in the transformed scale.
✎ Note
In the forecasters that follow a direct strategy (ForecasterDirect and ForecasterDirectMultiVariate) and in the Global Forecasting Models: Independent multi-series forecasting (ForecasterRecursiveMultiSeries), this attribute is a dict containing the residuals for each regressor or series.
# Forecaster in-sample residuals
# ==============================================================================
print("Length:", len(forecaster.in_sample_residuals_))
forecaster.in_sample_residuals_[:5]
Length: 197
array([-0.13137208, -0.03279934, -0.00417238, -0.02469213, 0.00563615])
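Conceptually, in-sample residuals are the differences between the training targets and the model's predictions for those same rows. The sketch below reproduces that idea with NumPy only: a synthetic series, a lag matrix, and an ordinary least squares fit standing in for the regressor. The observed-minus-predicted sign convention and all names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(123)
y = np.cumsum(rng.normal(size=50))   # synthetic series, illustration only
lags = np.array([1, 2, 3])
max_lag = lags.max()

# Lagged predictors and aligned targets (one row per usable time step).
X_train = np.column_stack([y[max_lag - lag : len(y) - lag] for lag in lags])
y_train = y[max_lag:]

# Ordinary least squares with an intercept stands in for the regressor.
A = np.column_stack([np.ones(len(X_train)), X_train])
coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
y_pred = A @ coef

# Assumed sign convention: observed minus predicted.
in_sample_residuals = y_train - y_pred
print(len(in_sample_residuals))   # 47
```

There is one residual per training row, so the series length minus the window needed for the predictors gives the number of residuals.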
Out-of-sample residuals¶
Residuals from the model's predictions on non-training data. If transformer_y is not None, the residuals are assumed to be in the transformed scale. Use the set_out_sample_residuals method to set the values.
As no values have been added, the attribute is None.
# Forecaster out-of-sample residuals
# ==============================================================================
forecaster.out_sample_residuals_