Hyperparameter tuning and lags selection¶
Hyperparameter tuning is a crucial aspect of developing accurate and effective machine learning models. In machine learning, hyperparameters are values that cannot be learned from data and must be set by the user before the model is trained. These hyperparameters can significantly impact the performance of the model, and tuning them carefully can improve its accuracy and generalization to new data. In the case of forecasting models, the lags included in the model can be considered an additional hyperparameter.
Hyperparameter tuning involves systematically testing different values or combinations of hyperparameters (including lags) to find the optimal configuration that produces the best results. The skforecast library offers various hyperparameter tuning strategies, including grid search, random search, and Bayesian search, that can be combined with backtesting or one-step-ahead validation to identify the optimal combination of lags and hyperparameters that achieve the best prediction performance.
💡 Tip
The computational cost of hyperparameter tuning depends heavily on the backtesting approach chosen to evaluate each hyperparameter combination. In general, the duration of the tuning process increases with the number of re-trains involved in the backtesting.
To effectively speed up the prototyping phase, it is highly recommended to adopt a two-step strategy. First, use refit=False during the initial search to narrow down the range of values. Then, focus on the identified region of interest and apply a tailored backtesting strategy that meets the specific requirements of the use case.
For additional tips on backtesting strategies, refer to the following resource: Which backtesting strategy should I use?.
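As a rough sketch of this two-step strategy (it relies on the TimeSeriesFold class and the data and end_train objects defined later in this document), the two phases differ only in the cross-validation object passed to the search function:

# Two-step tuning strategy (sketch)
# ==============================================================================
# Step 1: fast exploratory search, no refit between folds.
cv_fast = TimeSeriesFold(
    steps              = 12,
    initial_train_size = len(data.loc[:end_train]),
    refit              = False
)

# Step 2: re-evaluate the shortlisted candidates with a backtesting strategy
# closer to the intended use case, e.g. refitting the model at every fold.
cv_final = TimeSeriesFold(
    steps              = 12,
    initial_train_size = len(data.loc[:end_train]),
    refit              = True
)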
✎ Note
All backtesting and grid search functions have been extended to include the n_jobs argument, allowing multi-process parallelization for improved performance. This applies to all functions of the different model_selection modules.
The benefits of parallelization depend on several factors, including the regressor used, the number of fits to be performed, and the volume of data involved. When the n_jobs parameter is set to 'auto', the level of parallelization is automatically selected based on heuristic rules that aim to choose the best option for each scenario.
For a more detailed look at parallelization, visit Parallelization in skforecast.
Libraries and data¶
# Libraries
# ==============================================================================
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from lightgbm import LGBMRegressor
from sklearn.metrics import mean_squared_error
from skforecast.datasets import fetch_dataset
from skforecast.recursive import ForecasterRecursive
from skforecast.model_selection import TimeSeriesFold
from skforecast.model_selection import OneStepAheadFold
from skforecast.model_selection import grid_search_forecaster
from skforecast.model_selection import random_search_forecaster
from skforecast.model_selection import bayesian_search_forecaster
# Download data
# ==============================================================================
data = fetch_dataset(
    name="h2o", raw=True, kwargs_read_csv={"names": ["y", "datetime"], "header": 0}
)
# Data preprocessing
# ==============================================================================
data['datetime'] = pd.to_datetime(data['datetime'], format='%Y-%m-%d')
data = data.set_index('datetime')
data = data.asfreq('MS')
data = data[['y']]
data = data.sort_index()
# Train-val-test dates
# ==============================================================================
end_train = '2001-01-01 23:59:00'
end_val = '2006-01-01 23:59:00'
print(
    f"Train dates      : {data.index.min()} --- {data.loc[:end_train].index.max()}"
    f" (n={len(data.loc[:end_train])})"
)
print(
    f"Validation dates : {data.loc[end_train:].index.min()} --- {data.loc[:end_val].index.max()}"
    f" (n={len(data.loc[end_train:end_val])})"
)
print(
    f"Test dates       : {data.loc[end_val:].index.min()} --- {data.index.max()}"
    f" (n={len(data.loc[end_val:])})"
)
# Plot
# ==============================================================================
fig, ax = plt.subplots(figsize=(7, 3))
data.loc[:end_train].plot(ax=ax, label='train')
data.loc[end_train:end_val].plot(ax=ax, label='validation')
data.loc[end_val:].plot(ax=ax, label='test')
ax.legend();
h2o
---
Monthly expenditure ($AUD) on corticosteroid drugs that the Australian health system had between 1991 and 2008. Hyndman R (2023). fpp3: Data for Forecasting: Principles and Practice (3rd Edition). http://pkg.robjhyndman.com/fpp3package/, https://github.com/robjhyndman/fpp3package, http://OTexts.com/fpp3.
Shape of the dataset: (204, 2)

Train dates      : 1991-07-01 00:00:00 --- 2001-01-01 00:00:00  (n=115)
Validation dates : 2001-02-01 00:00:00 --- 2006-01-01 00:00:00  (n=60)
Test dates       : 2006-02-01 00:00:00 --- 2008-06-01 00:00:00  (n=29)
Grid search¶
Grid search is a popular hyperparameter tuning technique that evaluates an exhaustive set of combinations of hyperparameters and lags to find the optimal configuration for a forecasting model. To perform a grid search with the skforecast library, two grids are needed: one with different lags (lags_grid) and another with the hyperparameters (param_grid).
The grid search process involves the following steps:

1. grid_search_forecaster replaces the lags argument with the first option appearing in lags_grid.
2. The function validates all combinations of hyperparameters presented in param_grid using backtesting.
3. The function repeats these two steps until it has evaluated all possible combinations of lags and hyperparameters.
4. If return_best = True, the original forecaster is trained with the best lags and hyperparameters configuration found during the grid search process.
# Grid search hyperparameters and lags
# ==============================================================================
forecaster = ForecasterRecursive(
    regressor = LGBMRegressor(random_state=123, verbose=-1),
    lags      = 10  # Placeholder, the value will be overwritten
)

# Lags used as predictors
lags_grid = {
    'lags_1': 3,
    'lags_2': 10,
    'lags_3': [1, 2, 3, 20]
}

# Regressor hyperparameters
param_grid = {
    'n_estimators': [50, 100],
    'max_depth': [5, 10, 15]
}

# Folds
cv = TimeSeriesFold(
    steps              = 12,
    initial_train_size = len(data.loc[:end_train]),
    refit              = False
)

results = grid_search_forecaster(
    forecaster    = forecaster,
    y             = data.loc[:end_val, 'y'],
    param_grid    = param_grid,
    lags_grid     = lags_grid,
    cv            = cv,
    metric        = 'mean_squared_error',
    return_best   = True,
    n_jobs        = 'auto',
    verbose       = False,
    show_progress = True
)
results
`Forecaster` refitted using the best-found lags and parameters, and the whole data set:
  Lags: [1 2 3]
  Parameters: {'max_depth': 5, 'n_estimators': 100}
  Backtesting metric: 0.04387531272712768
  | lags | lags_label | params | mean_squared_error | max_depth | n_estimators |
---|---|---|---|---|---|---|
0 | [1, 2, 3] | lags_1 | {'max_depth': 5, 'n_estimators': 100} | 0.043875 | 5 | 100 |
1 | [1, 2, 3] | lags_1 | {'max_depth': 10, 'n_estimators': 100} | 0.043875 | 10 | 100 |
2 | [1, 2, 3] | lags_1 | {'max_depth': 15, 'n_estimators': 100} | 0.043875 | 15 | 100 |
3 | [1, 2, 3, 20] | lags_3 | {'max_depth': 15, 'n_estimators': 100} | 0.044074 | 15 | 100 |
4 | [1, 2, 3, 20] | lags_3 | {'max_depth': 10, 'n_estimators': 100} | 0.044074 | 10 | 100 |
5 | [1, 2, 3, 20] | lags_3 | {'max_depth': 5, 'n_estimators': 100} | 0.044074 | 5 | 100 |
6 | [1, 2, 3] | lags_1 | {'max_depth': 5, 'n_estimators': 50} | 0.045423 | 5 | 50 |
7 | [1, 2, 3] | lags_1 | {'max_depth': 15, 'n_estimators': 50} | 0.045423 | 15 | 50 |
8 | [1, 2, 3] | lags_1 | {'max_depth': 10, 'n_estimators': 50} | 0.045423 | 10 | 50 |
9 | [1, 2, 3, 20] | lags_3 | {'max_depth': 15, 'n_estimators': 50} | 0.046221 | 15 | 50 |
10 | [1, 2, 3, 20] | lags_3 | {'max_depth': 5, 'n_estimators': 50} | 0.046221 | 5 | 50 |
11 | [1, 2, 3, 20] | lags_3 | {'max_depth': 10, 'n_estimators': 50} | 0.046221 | 10 | 50 |
12 | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | lags_2 | {'max_depth': 5, 'n_estimators': 100} | 0.047896 | 5 | 100 |
13 | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | lags_2 | {'max_depth': 10, 'n_estimators': 100} | 0.047896 | 10 | 100 |
14 | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | lags_2 | {'max_depth': 15, 'n_estimators': 100} | 0.047896 | 15 | 100 |
15 | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | lags_2 | {'max_depth': 15, 'n_estimators': 50} | 0.051399 | 15 | 50 |
16 | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | lags_2 | {'max_depth': 5, 'n_estimators': 50} | 0.051399 | 5 | 50 |
17 | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | lags_2 | {'max_depth': 10, 'n_estimators': 50} | 0.051399 | 10 | 50 |
Since return_best = True
, the forecaster object is updated with the best configuration found and trained on the whole data set. This means that the final model obtained from the grid search uses the combination of lags and hyperparameters that achieved the best value of the chosen metric. This final model can then be used for future predictions on new data.
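Because the returned forecaster is already trained on the full data set, it can be used immediately to forecast future values; a short usage sketch:

# Predict with the tuned forecaster
# ==============================================================================
predictions = forecaster.predict(steps=12)  # forecast the next 12 months
predictions.head(3)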
forecaster
ForecasterRecursive
General Information
- Regressor: LGBMRegressor
- Lags: [1 2 3]
- Window features: None
- Window size: 3
- Exogenous included: False
- Weight function included: False
- Differentiation order: None
- Creation date: 2024-11-10 16:55:59
- Last fit date: 2024-11-10 16:56:01
- Skforecast version: 0.14.0
- Python version: 3.11.10
- Forecaster id: None
Exogenous Variables
- None
Data Transformations
- Transformer for y: None
- Transformer for exog: None
Training Information
- Training range: [Timestamp('1991-07-01 00:00:00'), Timestamp('2006-01-01 00:00:00')]
- Training index type: DatetimeIndex
- Training index frequency: MS
Regressor Parameters
- {'boosting_type': 'gbdt', 'class_weight': None, 'colsample_bytree': 1.0, 'importance_type': 'split', 'learning_rate': 0.1, 'max_depth': 5, 'min_child_samples': 20, 'min_child_weight': 0.001, 'min_split_gain': 0.0, 'n_estimators': 100, 'n_jobs': None, 'num_leaves': 31, 'objective': None, 'random_state': 123, 'reg_alpha': 0.0, 'reg_lambda': 0.0, 'subsample': 1.0, 'subsample_for_bin': 200000, 'subsample_freq': 0, 'verbose': -1}
Fit Kwargs
- {}
Random search¶
Random search is another hyperparameter tuning strategy available in the skforecast library. In contrast to grid search, which tries out all possible combinations of hyperparameters and lags, random search samples a fixed number of combinations from the specified distributions. The number of combinations evaluated is given by n_iter.
It is important to note that random sampling applies only to the model hyperparameters, not to the lags: all lag configurations specified by the user are evaluated.
# Random search hyperparameters and lags
# ==============================================================================
forecaster = ForecasterRecursive(
    regressor = LGBMRegressor(random_state=123, verbose=-1),
    lags      = 10  # Placeholder, the value will be overwritten
)

# Lags used as predictors
lags_grid = [3, 5]

# Regressor hyperparameters
param_distributions = {
    'n_estimators': np.arange(start=10, stop=100, step=1, dtype=int),
    'max_depth': np.arange(start=5, stop=30, step=1, dtype=int)
}

# Folds
cv = TimeSeriesFold(
    steps              = 12,
    initial_train_size = len(data.loc[:end_train]),
    refit              = False
)

results = random_search_forecaster(
    forecaster          = forecaster,
    y                   = data.loc[:end_val, 'y'],
    lags_grid           = lags_grid,
    param_distributions = param_distributions,
    cv                  = cv,
    n_iter              = 5,
    metric              = 'mean_squared_error',
    return_best         = True,
    random_state        = 123,
    n_jobs              = 'auto',
    verbose             = False,
    show_progress       = True
)
results.head(4)
`Forecaster` refitted using the best-found lags and parameters, and the whole data set:
  Lags: [1 2 3 4 5]
  Parameters: {'n_estimators': 96, 'max_depth': 19}
  Backtesting metric: 0.04313147793349785
  | lags | lags_label | params | mean_squared_error | n_estimators | max_depth |
---|---|---|---|---|---|---|
0 | [1, 2, 3, 4, 5] | [1, 2, 3, 4, 5] | {'n_estimators': 96, 'max_depth': 19} | 0.043131 | 96 | 19 |
1 | [1, 2, 3, 4, 5] | [1, 2, 3, 4, 5] | {'n_estimators': 94, 'max_depth': 28} | 0.043171 | 94 | 28 |
2 | [1, 2, 3, 4, 5] | [1, 2, 3, 4, 5] | {'n_estimators': 77, 'max_depth': 17} | 0.043663 | 77 | 17 |
3 | [1, 2, 3] | [1, 2, 3] | {'n_estimators': 96, 'max_depth': 19} | 0.043868 | 96 | 19 |
Bayesian search¶
Grid and random search can generate good results, especially when the search range is narrowed down. However, neither of them takes into account the results obtained so far, which prevents them from focusing the search on the most promising regions and avoiding unnecessary evaluations.
An alternative is to use Bayesian optimization methods to search for hyperparameters. In general terms, Bayesian hyperparameter optimization consists of creating a probabilistic model in which the objective function is the model validation metric (RMSE, AUC, accuracy...). With this strategy, the search is redirected at each iteration to the regions of greatest interest. The ultimate goal is to reduce the number of hyperparameter combinations with which the model is evaluated, choosing only the best candidates. This approach is particularly advantageous when the search space is very large or the model evaluation is very slow.
⚠ Warning
lags_grid is no longer required when using bayesian_search_forecaster since skforecast 0.12.0. The lags argument is now included in the search_space. This allows the lags to be optimized along with the other hyperparameters of the regressor during the Bayesian search.
In skforecast, Bayesian optimization with Optuna is performed using its Study object. The objective of the optimization is to minimize the metric generated by backtesting.
Additional parameters can be included by passing dictionaries to the kwargs_create_study and kwargs_study_optimize arguments, which are forwarded to the create_study function and the optimize method, respectively. These arguments are used to configure the study object and the optimization algorithm.
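For example, a specific Optuna sampler or a time budget for the optimization can be configured this way. A minimal sketch (the TPESampler and the 60-second timeout are illustrative choices, not values used elsewhere in this document):

# Configuring the Optuna study and optimization (sketch)
# ==============================================================================
import optuna

kwargs_create_study   = {'sampler': optuna.samplers.TPESampler(seed=123)}
kwargs_study_optimize = {'timeout': 60}  # stop the optimization after 60 seconds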
To use Optuna in skforecast, the search_space argument must be a Python function that defines the hyperparameters to optimize over. Optuna uses a Trial object to generate each candidate from the search space.
# Bayesian search hyperparameters and lags with Optuna
# ==============================================================================
forecaster = ForecasterRecursive(
    regressor = LGBMRegressor(random_state=123, verbose=-1),
    lags      = 10  # Placeholder, the value will be overwritten
)

# Search space
def search_space(trial):
    search_space = {
        'lags'            : trial.suggest_categorical('lags', [3, 5]),
        'n_estimators'    : trial.suggest_int('n_estimators', 10, 20),
        'min_samples_leaf': trial.suggest_int('min_samples_leaf', 1, 10),
        'max_features'    : trial.suggest_categorical('max_features', ['log2', 'sqrt'])
    }
    return search_space

# Folds
cv = TimeSeriesFold(
    steps              = 12,
    initial_train_size = len(data.loc[:end_train]),
    refit              = False
)

results, best_trial = bayesian_search_forecaster(
    forecaster            = forecaster,
    y                     = data.loc[:end_val, 'y'],
    search_space          = search_space,
    cv                    = cv,
    metric                = 'mean_absolute_error',
    n_trials              = 10,
    random_state          = 123,
    return_best           = False,
    n_jobs                = 'auto',
    verbose               = False,
    show_progress         = True,
    kwargs_create_study   = {},
    kwargs_study_optimize = {}
)
results.head(4)
  | lags | params | mean_absolute_error | n_estimators | min_samples_leaf | max_features |
---|---|---|---|---|---|---|
0 | [1, 2, 3, 4, 5] | {'n_estimators': 19, 'min_samples_leaf': 3, 'm... | 0.126995 | 19 | 3 | sqrt |
1 | [1, 2, 3] | {'n_estimators': 15, 'min_samples_leaf': 4, 'm... | 0.153278 | 15 | 4 | sqrt |
2 | [1, 2, 3] | {'n_estimators': 13, 'min_samples_leaf': 3, 'm... | 0.160396 | 13 | 3 | sqrt |
3 | [1, 2, 3, 4, 5] | {'n_estimators': 14, 'min_samples_leaf': 5, 'm... | 0.172366 | 14 | 5 | log2 |
best_trial contains information about the trial that achieved the best results. See the Optuna Study class for more details.
# Optuna best trial in the study
# ==============================================================================
best_trial
FrozenTrial(number=7, state=1, values=[0.1269945910624239], datetime_start=datetime.datetime(2024, 11, 10, 16, 56, 2, 465549), datetime_complete=datetime.datetime(2024, 11, 10, 16, 56, 2, 526349), params={'lags': 5, 'n_estimators': 19, 'min_samples_leaf': 3, 'max_features': 'sqrt'}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'lags': CategoricalDistribution(choices=(3, 5)), 'n_estimators': IntDistribution(high=20, log=False, low=10, step=1), 'min_samples_leaf': IntDistribution(high=10, log=False, low=1, step=1), 'max_features': CategoricalDistribution(choices=('log2', 'sqrt'))}, trial_id=7, value=None)
One-step-ahead validation¶
Hyperparameter and lag tuning involves systematically testing different values or combinations of hyperparameters (and/or lags) to find the optimal configuration that gives the best performance. The skforecast library provides two different methods to evaluate each candidate configuration:
- Backtesting: In this method, the model predicts several steps ahead in each iteration, using the same forecast horizon and retraining frequency strategy that would be used if the model were deployed. This simulates a real forecasting scenario where the model is retrained and updated over time. More information here.
- One-step-ahead: Evaluates the model using only one-step-ahead predictions. This method is faster because it requires fewer iterations, but it only tests the model's performance in the immediate next time step ($t+1$).
Each method uses a different evaluation strategy, so they may produce different results. However, in the long run, both methods are expected to converge to similar selections of optimal hyperparameters. Since the one-step-ahead method only assesses performance at the next time step, it is recommended to backtest the final model for a more accurate multi-step performance estimate.
💡 Tip
For a more detailed comparison of the results (execution time and metric) obtained with each strategy, visit Hyperparameters and lags search: backtesting vs one-step-ahead.
# Bayesian search with OneStepAheadFold
# ==============================================================================
forecaster = ForecasterRecursive(
    regressor = LGBMRegressor(random_state=123, verbose=-1),
    lags      = 10  # Placeholder, the value will be overwritten
)

# Search space
def search_space(trial):
    search_space = {
        'lags'            : trial.suggest_categorical('lags', [3, 5]),
        'n_estimators'    : trial.suggest_int('n_estimators', 10, 20),
        'min_samples_leaf': trial.suggest_int('min_samples_leaf', 1, 10),
        'max_features'    : trial.suggest_categorical('max_features', ['log2', 'sqrt'])
    }
    return search_space

# Folds
cv = OneStepAheadFold(initial_train_size = len(data.loc[:end_train]))

results, best_trial = bayesian_search_forecaster(
    forecaster            = forecaster,
    y                     = data.loc[:end_val, 'y'],
    search_space          = search_space,
    cv                    = cv,
    metric                = 'mean_absolute_error',
    n_trials              = 10,
    random_state          = 123,
    return_best           = False,
    n_jobs                = 'auto',
    verbose               = False,
    show_progress         = True,
    kwargs_create_study   = {},
    kwargs_study_optimize = {}
)
results.head(4)
OneStepAheadValidationWarning: One-step-ahead predictions are used for faster model comparison, but they may not fully represent multi-step prediction performance. It is recommended to backtest the final model for a more accurate multi-step performance estimate. You can suppress this warning using: warnings.simplefilter('ignore', category=OneStepAheadValidationWarning)
  | lags | params | mean_absolute_error | n_estimators | min_samples_leaf | max_features |
---|---|---|---|---|---|---|
0 | [1, 2, 3, 4, 5] | {'n_estimators': 20, 'min_samples_leaf': 6, 'm... | 0.180137 | 20 | 6 | log2 |
1 | [1, 2, 3, 4, 5] | {'n_estimators': 14, 'min_samples_leaf': 5, 'm... | 0.180815 | 14 | 5 | log2 |
2 | [1, 2, 3, 4, 5] | {'n_estimators': 16, 'min_samples_leaf': 9, 'm... | 0.187584 | 16 | 9 | log2 |
3 | [1, 2, 3] | {'n_estimators': 14, 'min_samples_leaf': 7, 'm... | 0.188359 | 14 | 7 | log2 |
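Following the recommendation above, the configuration selected with the faster one-step-ahead search can be re-evaluated with backtesting before deployment. A minimal sketch, assuming the best_trial obtained in the previous search (note that the 'lags' entry must be separated from the regressor hyperparameters):

# Backtest the configuration selected by one-step-ahead validation (sketch)
# ==============================================================================
from skforecast.model_selection import backtesting_forecaster

regressor_params = {k: v for k, v in best_trial.params.items() if k != 'lags'}
forecaster = ForecasterRecursive(
    regressor = LGBMRegressor(random_state=123, verbose=-1, **regressor_params),
    lags      = best_trial.params['lags']
)
cv = TimeSeriesFold(
    steps              = 12,
    initial_train_size = len(data.loc[:end_train]),
    refit              = False
)
metric_value, predictions = backtesting_forecaster(
    forecaster = forecaster,
    y          = data.loc[:end_val, 'y'],
    cv         = cv,
    metric     = 'mean_absolute_error'
)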
Hyperparameter tuning with custom metric¶
In addition to commonly used metrics such as mean_squared_error, mean_absolute_error, and mean_absolute_percentage_error, users have the flexibility to define their own custom metric function, provided that it includes the arguments y_true (the true values of the series) and y_pred (the predicted values), and returns a numeric value (either a float or an int).
This flexibility enables users to evaluate the model's predictive performance in a wide range of scenarios, such as considering only certain months or days, excluding holidays, or focusing only on the last step of the predicted horizon.
The example below illustrates this: a 12-month horizon is forecasted, but the metric of interest considers only the last three months of each year. This is achieved by defining a custom metric function that takes into account only the relevant months, which is then passed as the metric argument of the search function.
# Custom metric
# ==============================================================================
def custom_metric(y_true, y_pred):
    """
    Calculate the mean squared error using only the predicted values of the last
    3 months of the year.
    """
    mask = y_true.index.month.isin([10, 11, 12])
    metric = mean_squared_error(y_true[mask], y_pred[mask])

    return metric
# Grid search hyperparameter and lags with custom metric
# ==============================================================================
forecaster = ForecasterRecursive(
    regressor = LGBMRegressor(random_state=123, verbose=-1),
    lags      = 10  # Placeholder, the value will be overwritten
)

# Lags used as predictors
lags_grid = [3, 10, [1, 2, 3, 20]]

# Regressor hyperparameters
param_grid = {
    'n_estimators': [50, 100],
    'max_depth': [5, 10, 15]
}

# Folds
cv = TimeSeriesFold(
    steps              = 12,
    initial_train_size = len(data.loc[:end_train]),
    refit              = False
)

results = grid_search_forecaster(
    forecaster    = forecaster,
    y             = data.loc[:end_val, 'y'],
    cv            = cv,
    param_grid    = param_grid,
    lags_grid     = lags_grid,
    metric        = custom_metric,
    return_best   = True,
    n_jobs        = 'auto',
    verbose       = False,
    show_progress = True
)
results.head(4)
`Forecaster` refitted using the best-found lags and parameters, and the whole data set:
  Lags: [ 1  2  3 20]
  Parameters: {'max_depth': 15, 'n_estimators': 100}
  Backtesting metric: 0.0681822427249296
  | lags | lags_label | params | custom_metric | max_depth | n_estimators |
---|---|---|---|---|---|---|
0 | [1, 2, 3, 20] | [1, 2, 3, 20] | {'max_depth': 15, 'n_estimators': 100} | 0.068182 | 15 | 100 |
1 | [1, 2, 3, 20] | [1, 2, 3, 20] | {'max_depth': 10, 'n_estimators': 100} | 0.068182 | 10 | 100 |
2 | [1, 2, 3, 20] | [1, 2, 3, 20] | {'max_depth': 5, 'n_estimators': 100} | 0.068182 | 5 | 100 |
3 | [1, 2, 3] | [1, 2, 3] | {'max_depth': 5, 'n_estimators': 100} | 0.070472 | 5 | 100 |
Compare multiple metrics¶
All three functions (grid_search_forecaster, random_search_forecaster, and bayesian_search_forecaster) allow multiple metrics to be calculated for each forecaster configuration by providing a list. The list may include any combination of built-in metrics, such as mean_squared_error, mean_absolute_error, and mean_absolute_percentage_error, as well as user-defined custom metrics.
Note that if multiple metrics are specified, these functions select the best model based on the first metric in the list.
# Grid search hyperparameter and lags with multiple metrics
# ==============================================================================
forecaster = ForecasterRecursive(
    regressor = LGBMRegressor(random_state=123, verbose=-1),
    lags      = 10  # Placeholder, the value will be overwritten
)

# Lags used as predictors
lags_grid = [3, 10, [1, 2, 3, 20]]

# Regressor hyperparameters
param_grid = {
    'n_estimators': [50, 100],
    'max_depth': [5, 10, 15]
}

# Folds
cv = TimeSeriesFold(
    steps              = 12,
    initial_train_size = len(data.loc[:end_train]),
    refit              = False
)

results = grid_search_forecaster(
    forecaster    = forecaster,
    y             = data.loc[:end_val, 'y'],
    param_grid    = param_grid,
    lags_grid     = lags_grid,
    cv            = cv,
    metric        = ['mean_absolute_error', mean_squared_error, custom_metric],
    return_best   = True,
    n_jobs        = 'auto',
    verbose       = False,
    show_progress = True
)
results.head(4)
`Forecaster` refitted using the best-found lags and parameters, and the whole data set:
  Lags: [1 2 3]
  Parameters: {'max_depth': 5, 'n_estimators': 100}
  Backtesting metric: 0.18359367014650177
  | lags | lags_label | params | mean_absolute_error | mean_squared_error | custom_metric | max_depth | n_estimators |
---|---|---|---|---|---|---|---|---|
0 | [1, 2, 3] | [1, 2, 3] | {'max_depth': 5, 'n_estimators': 100} | 0.183594 | 0.043875 | 0.070472 | 5 | 100 |
1 | [1, 2, 3] | [1, 2, 3] | {'max_depth': 10, 'n_estimators': 100} | 0.183594 | 0.043875 | 0.070472 | 10 | 100 |
2 | [1, 2, 3] | [1, 2, 3] | {'max_depth': 15, 'n_estimators': 100} | 0.183594 | 0.043875 | 0.070472 | 15 | 100 |
3 | [1, 2, 3, 20] | [1, 2, 3, 20] | {'max_depth': 15, 'n_estimators': 100} | 0.184901 | 0.044074 | 0.068182 | 15 | 100 |
Compare multiple regressors¶
The grid search process can be easily extended to compare several machine learning models. This can be achieved with a simple for loop that iterates over each regressor and applies the grid_search_forecaster
function. This approach allows for a more thorough exploration and can help select the best model.
# Models to compare
from sklearn.ensemble import RandomForestRegressor
from lightgbm import LGBMRegressor
from sklearn.linear_model import Ridge
models = [
    RandomForestRegressor(random_state=123),
    LGBMRegressor(random_state=123, verbose=-1),
    Ridge(random_state=123)
]

# Hyperparameters to search for each model
param_grids = {
    'RandomForestRegressor': {'n_estimators': [50, 100], 'max_depth': [5, 15]},
    'LGBMRegressor': {'n_estimators': [20, 50], 'max_depth': [5, 10]},
    'Ridge': {'alpha': [0.01, 0.1, 1]}
}

# Lags used as predictors
lags_grid = [3, 5]

# Folds
cv = TimeSeriesFold(
    steps              = 3,
    initial_train_size = len(data.loc[:end_train]),
    refit              = False
)

df_results = pd.DataFrame()
for i, model in enumerate(models):

    print(f"Grid search for regressor: {model}")
    print("-------------------------")

    forecaster = ForecasterRecursive(
        regressor = model,
        lags      = 3
    )

    # Regressor hyperparameters
    param_grid = param_grids[list(param_grids)[i]]

    results = grid_search_forecaster(
        forecaster    = forecaster,
        y             = data.loc[:end_val, 'y'],
        param_grid    = param_grid,
        lags_grid     = lags_grid,
        cv            = cv,
        metric        = 'mean_squared_error',
        return_best   = False,
        n_jobs        = 'auto',
        verbose       = False,
        show_progress = True
    )

    # Create a column with model name
    results['model'] = list(param_grids)[i]

    df_results = pd.concat([df_results, results])

df_results = df_results.sort_values(by='mean_squared_error')
df_results.head(10)
df_results = df_results.sort_values(by='mean_squared_error')
df_results.head(10)
Grid search for regressor: RandomForestRegressor(random_state=123)
-------------------------
Grid search for regressor: LGBMRegressor(random_state=123, verbose=-1)
-------------------------
Grid search for regressor: Ridge(random_state=123)
-------------------------
  | lags | lags_label | params | mean_squared_error | max_depth | n_estimators | model | alpha |
---|---|---|---|---|---|---|---|---|
0 | [1, 2, 3, 4, 5] | [1, 2, 3, 4, 5] | {'max_depth': 5, 'n_estimators': 50} | 0.050180 | 5.0 | 50.0 | LGBMRegressor | NaN |
1 | [1, 2, 3, 4, 5] | [1, 2, 3, 4, 5] | {'max_depth': 10, 'n_estimators': 50} | 0.050180 | 10.0 | 50.0 | LGBMRegressor | NaN |
2 | [1, 2, 3] | [1, 2, 3] | {'max_depth': 5, 'n_estimators': 50} | 0.050907 | 5.0 | 50.0 | LGBMRegressor | NaN |
3 | [1, 2, 3] | [1, 2, 3] | {'max_depth': 10, 'n_estimators': 50} | 0.050907 | 10.0 | 50.0 | LGBMRegressor | NaN |
5 | [1, 2, 3] | [1, 2, 3] | {'max_depth': 10, 'n_estimators': 20} | 0.056990 | 10.0 | 20.0 | LGBMRegressor | NaN |
4 | [1, 2, 3] | [1, 2, 3] | {'max_depth': 5, 'n_estimators': 20} | 0.056990 | 5.0 | 20.0 | LGBMRegressor | NaN |
7 | [1, 2, 3, 4, 5] | [1, 2, 3, 4, 5] | {'max_depth': 10, 'n_estimators': 20} | 0.057542 | 10.0 | 20.0 | LGBMRegressor | NaN |
6 | [1, 2, 3, 4, 5] | [1, 2, 3, 4, 5] | {'max_depth': 5, 'n_estimators': 20} | 0.057542 | 5.0 | 20.0 | LGBMRegressor | NaN |
0 | [1, 2, 3] | [1, 2, 3] | {'alpha': 0.01} | 0.059814 | NaN | NaN | Ridge | 0.01 |
1 | [1, 2, 3] | [1, 2, 3] | {'alpha': 0.1} | 0.060078 | NaN | NaN | Ridge | 0.10 |
Saving results to file¶
The results of the hyperparameter search process can be saved to a file by setting the output_file argument to the desired path. The results are saved in tab-separated values (TSV) format and contain the hyperparameters, lags, and metrics of each configuration evaluated during the search.
The file is updated after each hyperparameter evaluation, so if the optimization is stopped partway through, the results obtained up to that point have already been stored. This can be useful for further analysis or to keep a record of the tuning process.
# Save results to file
# ==============================================================================
forecaster = ForecasterRecursive(
    regressor = LGBMRegressor(random_state=123, verbose=-1),
    lags      = 10  # Placeholder, the value will be overwritten
)

# Lags used as predictors
lags_grid = [3, 10, [1, 2, 3, 20]]

# Regressor hyperparameters
param_grid = {
    'n_estimators': [50, 100],
    'max_depth': [5, 10, 15]
}

# Folds
cv = TimeSeriesFold(
    steps              = 12,
    initial_train_size = len(data.loc[:end_train]),
    refit              = False
)

results = grid_search_forecaster(
    forecaster    = forecaster,
    y             = data.loc[:end_val, 'y'],
    param_grid    = param_grid,
    lags_grid     = lags_grid,
    cv            = cv,
    metric        = 'mean_squared_error',
    return_best   = True,
    n_jobs        = 'auto',
    verbose       = False,
    show_progress = True,
    output_file   = "results_grid_search.txt"
)
`Forecaster` refitted using the best-found lags and parameters, and the whole data set:
  Lags: [1 2 3]
  Parameters: {'max_depth': 5, 'n_estimators': 100}
  Backtesting metric: 0.04387531272712768
# Read results file
# ==============================================================================
pd.read_csv("results_grid_search.txt", sep="\t")
  | lags | lags_label | params | mean_squared_error | max_depth | n_estimators |
---|---|---|---|---|---|---|
0 | [1 2 3] | [1 2 3] | {'max_depth': 5, 'n_estimators': 50} | 0.045423 | 5 | 50 |
1 | [1 2 3] | [1 2 3] | {'max_depth': 5, 'n_estimators': 100} | 0.043875 | 5 | 100 |
2 | [1 2 3] | [1 2 3] | {'max_depth': 10, 'n_estimators': 50} | 0.045423 | 10 | 50 |
3 | [1 2 3] | [1 2 3] | {'max_depth': 10, 'n_estimators': 100} | 0.043875 | 10 | 100 |
4 | [1 2 3] | [1 2 3] | {'max_depth': 15, 'n_estimators': 50} | 0.045423 | 15 | 50 |
5 | [1 2 3] | [1 2 3] | {'max_depth': 15, 'n_estimators': 100} | 0.043875 | 15 | 100 |
6 | [ 1 2 3 4 5 6 7 8 9 10] | [ 1 2 3 4 5 6 7 8 9 10] | {'max_depth': 5, 'n_estimators': 50} | 0.051399 | 5 | 50 |
7 | [ 1 2 3 4 5 6 7 8 9 10] | [ 1 2 3 4 5 6 7 8 9 10] | {'max_depth': 5, 'n_estimators': 100} | 0.047896 | 5 | 100 |
8 | [ 1 2 3 4 5 6 7 8 9 10] | [ 1 2 3 4 5 6 7 8 9 10] | {'max_depth': 10, 'n_estimators': 50} | 0.051399 | 10 | 50 |
9 | [ 1 2 3 4 5 6 7 8 9 10] | [ 1 2 3 4 5 6 7 8 9 10] | {'max_depth': 10, 'n_estimators': 100} | 0.047896 | 10 | 100 |
10 | [ 1 2 3 4 5 6 7 8 9 10] | [ 1 2 3 4 5 6 7 8 9 10] | {'max_depth': 15, 'n_estimators': 50} | 0.051399 | 15 | 50 |
11 | [ 1 2 3 4 5 6 7 8 9 10] | [ 1 2 3 4 5 6 7 8 9 10] | {'max_depth': 15, 'n_estimators': 100} | 0.047896 | 15 | 100 |
12 | [ 1 2 3 20] | [ 1 2 3 20] | {'max_depth': 5, 'n_estimators': 50} | 0.046221 | 5 | 50 |
13 | [ 1 2 3 20] | [ 1 2 3 20] | {'max_depth': 5, 'n_estimators': 100} | 0.044074 | 5 | 100 |
14 | [ 1 2 3 20] | [ 1 2 3 20] | {'max_depth': 10, 'n_estimators': 50} | 0.046221 | 10 | 50 |
15 | [ 1 2 3 20] | [ 1 2 3 20] | {'max_depth': 10, 'n_estimators': 100} | 0.044074 | 10 | 100 |
16 | [ 1 2 3 20] | [ 1 2 3 20] | {'max_depth': 15, 'n_estimators': 50} | 0.046221 | 15 | 50 |
17 | [ 1 2 3 20] | [ 1 2 3 20] | {'max_depth': 15, 'n_estimators': 100} | 0.044074 | 15 | 100 |