Save and load forecasters¶
Skforecast models can be saved to and loaded from disk using the pickle or joblib library. Two helper functions, save_forecaster and load_forecaster, are available to streamline this process. See below for a simple example.
A forecaster_id has been included when initializing the Forecaster; this can help identify the target of the model.
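As a rough illustration of what saving and loading amounts to, the sketch below round-trips an object through pickle using only the standard library. The `SimpleNamespace` object is a hypothetical stand-in for a fitted forecaster, not a skforecast class, and the file name is chosen only for the example.

```python
import pickle
import tempfile
from pathlib import Path
from types import SimpleNamespace

# Hypothetical stand-in for a fitted forecaster (not a skforecast object).
model = SimpleNamespace(forecaster_id="forecaster_001", lags=[1, 2, 3, 4, 5])

with tempfile.TemporaryDirectory() as tmp_dir:
    file_name = Path(tmp_dir) / "forecaster_001.pkl"

    # Save: serialize the object to disk.
    with open(file_name, "wb") as f:
        pickle.dump(model, f)

    # Load: rebuild an equivalent object from the file.
    with open(file_name, "rb") as f:
        model_loaded = pickle.load(f)

# The loaded object is a new copy that carries the same attributes.
print(model_loaded.forecaster_id)
```

The loaded object is a distinct copy with the same state, which is why an identifier stored on the object travels with the saved file.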
Note
Learn how to use forecaster models in production.
Libraries¶
# Libraries
# ==============================================================================
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from skforecast.ForecasterAutoreg import ForecasterAutoreg
from skforecast.ForecasterAutoregCustom import ForecasterAutoregCustom
from skforecast.utils import save_forecaster
from skforecast.utils import load_forecaster
from skforecast.datasets import fetch_dataset
Data¶
# Download data
# ==============================================================================
data = fetch_dataset(
    name="h2o", raw=True, kwargs_read_csv={"names": ["y", "date"], "header": 0}
)
data['date'] = pd.to_datetime(data['date'], format='%Y-%m-%d')
data = data.set_index('date')
data = data.asfreq('MS')
h2o
---
Monthly expenditure ($AUD) on corticosteroid drugs that the Australian health system had between 1991 and 2008.
Hyndman R (2023). fpp3: Data for Forecasting: Principles and Practice (3rd Edition). http://pkg.robjhyndman.com/fpp3package/, https://github.com/robjhyndman/fpp3package, http://OTexts.com/fpp3.
Shape of the dataset: (204, 2)
Save and load forecaster model¶
# Create and train forecaster
# ==============================================================================
forecaster = ForecasterAutoreg(
    regressor     = RandomForestRegressor(random_state=123),
    lags          = 5,
    forecaster_id = "forecaster_001"
)
forecaster.fit(y=data['y'])
forecaster.predict(steps=3)
2008-07-01    0.714526
2008-08-01    0.789144
2008-09-01    0.818433
Freq: MS, Name: pred, dtype: float64
# Save model
# ==============================================================================
save_forecaster(forecaster, file_name='forecaster_001.joblib', verbose=False)
# Load model
# ==============================================================================
forecaster_loaded = load_forecaster('forecaster_001.joblib', verbose=True)
=================
ForecasterAutoreg
=================
Regressor: RandomForestRegressor(random_state=123)
Lags: [1 2 3 4 5]
Transformer for y: None
Transformer for exog: None
Window size: 5
Weight function included: False
Differentiation order: None
Exogenous included: False
Type of exogenous variable: None
Exogenous variables names: None
Training range: [Timestamp('1991-07-01 00:00:00'), Timestamp('2008-06-01 00:00:00')]
Training index type: DatetimeIndex
Training index frequency: MS
Regressor parameters: {'bootstrap': True, 'ccp_alpha': 0.0, 'criterion': 'squared_error', 'max_depth': None, 'max_features': 1.0, 'max_leaf_nodes': None, 'max_samples': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'monotonic_cst': None, 'n_estimators': 100, 'n_jobs': None, 'oob_score': False, 'random_state': 123, 'verbose': 0, 'warm_start': False}
fit_kwargs: {}
Creation date: 2024-05-15 14:17:25
Last fit date: 2024-05-15 14:17:25
Skforecast version: 0.12.0
Python version: 3.11.8
Forecaster id: forecaster_001
# Predict
# ==============================================================================
forecaster_loaded.predict(steps=3)
2008-07-01    0.714526
2008-08-01    0.789144
2008-09-01    0.818433
Freq: MS, Name: pred, dtype: float64
# Forecaster identifier
# ==============================================================================
forecaster.forecaster_id
'forecaster_001'
Saving and Loading a Forecaster Model with Custom Functions¶
Sometimes external functions are needed when creating a Forecaster object. For example:
A custom function to create predictors.
A function to reduce the impact of some dates on the model (see Weighted Time Series Forecasting).
For your code to work properly, these functions must be available in the environment where the Forecaster is loaded.
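The requirement above follows from how pickle (which joblib builds on) handles functions: it stores a reference to the module and function name, not the function's code. The sketch below illustrates this with the standard library only. The module name `demo_create_predictors` is hypothetical and is created in memory purely for the demonstration.

```python
import pickle
import sys
import types

# Hypothetical module standing in for a file such as create_predictors.py.
source = '''
def create_predictors(y):
    """Return the last 5 values of y, most recent first."""
    return y[-1:-6:-1]
'''
module = types.ModuleType("demo_create_predictors")
exec(source, module.__dict__)
sys.modules["demo_create_predictors"] = module

# Pickle stores only a reference (module name + function name), not the code.
payload = pickle.dumps(module.create_predictors)
reference_only = (
    b"demo_create_predictors" in payload and b"y[-1:-6:-1]" not in payload
)

# Simulate a fresh environment where the module is not available.
del sys.modules["demo_create_predictors"]
try:
    pickle.loads(payload)
    load_without_module = "ok"
except ImportError as exc:
    load_without_module = type(exc).__name__

# Once the module is importable again, loading succeeds.
sys.modules["demo_create_predictors"] = module
restored = pickle.loads(payload)
```

Loading fails with an import error while the module is missing and succeeds once it is available again, which mirrors what happens when a saved Forecaster references a custom function.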
# Custom function to create predictors
# ==============================================================================
def create_predictors(y):
    """
    Create first 5 lags of a time series.
    """
    lags = y[-1:-6:-1]
    return lags
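To make the slice `y[-1:-6:-1]` concrete, here is a small sketch that applies the same function to a plain Python list. The input values are made up for illustration; the slice behaves the same way on a one-dimensional NumPy array.

```python
def create_predictors(y):
    """Create first 5 lags of a time series."""
    return y[-1:-6:-1]


# Made-up window of observations, oldest first.
window = [10, 20, 30, 40, 50, 60, 70]

lags = create_predictors(window)
print(lags)  # [70, 60, 50, 40, 30]
```

The first element is the most recent observation (lag 1) and the fifth is the observation five steps back (lag 5).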
# Create and train forecaster
# ==============================================================================
forecaster = ForecasterAutoregCustom(
    regressor      = RandomForestRegressor(random_state=123),
    fun_predictors = create_predictors,
    window_size    = 5
)
forecaster.fit(y=data['y'])
When save_custom_functions=True, the save_forecaster function also saves each custom function used to create the forecaster in a separate file. In this case, the created module is create_predictors.py.
# Save model and custom function
# ==============================================================================
save_forecaster(forecaster, file_name='forecaster_custom.joblib',
                save_custom_functions=True, verbose=False)
These functions must be imported into the environment where the Forecaster is going to be loaded.
# Load model and custom function
# ==============================================================================
from create_predictors import create_predictors
forecaster_loaded = load_forecaster('forecaster_custom.joblib', verbose=True)
=======================
ForecasterAutoregCustom
=======================
Regressor: RandomForestRegressor(random_state=123)
Predictors created with function: create_predictors
Transformer for y: None
Transformer for exog: None
Window size: 5
Weight function included: False
Differentiation order: None
Exogenous included: False
Type of exogenous variable: None
Exogenous variables names: None
Training range: [Timestamp('1991-07-01 00:00:00'), Timestamp('2008-06-01 00:00:00')]
Training index type: DatetimeIndex
Training index frequency: MS
Regressor parameters: {'bootstrap': True, 'ccp_alpha': 0.0, 'criterion': 'squared_error', 'max_depth': None, 'max_features': 1.0, 'max_leaf_nodes': None, 'max_samples': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'monotonic_cst': None, 'n_estimators': 100, 'n_jobs': None, 'oob_score': False, 'random_state': 123, 'verbose': 0, 'warm_start': False}
fit_kwargs: {}
Creation date: 2024-05-15 14:17:25
Last fit date: 2024-05-15 14:17:25
Skforecast version: 0.12.0
Python version: 3.11.8
Forecaster id: None
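The import of create_predictors shown earlier only works if create_predictors.py is on Python's import path. If the file lives in another directory, one possible approach is to add that directory to `sys.path` before importing. The sketch below assumes a hypothetical temporary directory and writes its own stand-in file rather than relying on one produced by save_forecaster.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    # Stand-in for the module file saved alongside the forecaster.
    module_path = Path(tmp_dir) / "create_predictors.py"
    module_path.write_text(
        "def create_predictors(y):\n"
        "    return y[-1:-6:-1]\n"
    )

    # Make the directory importable, import the module, then restore sys.path.
    sys.path.insert(0, tmp_dir)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("create_predictors")
    finally:
        sys.path.remove(tmp_dir)

# The function is now available in the current environment.
create_predictors = module.create_predictors
```

With the function imported this way, a subsequent call to load_forecaster would find it in the environment, as the section requires.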