Hyperparameter tuning and lags selection¶
Hyperparameter tuning is a crucial aspect of developing accurate and effective machine learning models. In machine learning, hyperparameters are values that cannot be learned from data and must be set by the user before the model is trained. These hyperparameters can significantly impact the performance of the model, and tuning them carefully can improve its accuracy and generalization to new data. In the case of forecasting models, the lags included in the model can be considered as an additional hyperparameter.
Hyperparameter tuning involves systematically testing different values or combinations of hyperparameters (including lags) to find the optimal configuration that produces the best results. The Skforecast library offers various hyperparameter tuning strategies, including grid search, random search, and Bayesian search, that can be combined with backtesting to identify the optimal combination of lags and hyperparameters that achieve the best prediction performance.
Libraries¶
# Libraries
# ==============================================================================
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor
from skforecast.ForecasterAutoreg import ForecasterAutoreg
from skforecast.model_selection import grid_search_forecaster
from skforecast.model_selection import random_search_forecaster
from skforecast.model_selection import bayesian_search_forecaster
from sklearn.metrics import mean_squared_error
Data¶
# Download data
# ==============================================================================
url = ('https://raw.githubusercontent.com/JoaquinAmatRodrigo/skforecast/master/data/h2o.csv')
data = pd.read_csv(url, sep=',', header=0, names=['y', 'datetime'])
# Data preprocessing
# ==============================================================================
data['datetime'] = pd.to_datetime(data['datetime'], format='%Y/%m/%d')
data = data.set_index('datetime')
data = data.asfreq('MS')
data = data[['y']]
data = data.sort_index()
# Train-val-test dates
# ==============================================================================
end_train = '2001-01-01 23:59:00'
end_val = '2006-01-01 23:59:00'
print(f"Train dates : {data.index.min()} --- {data.loc[:end_train].index.max()} (n={len(data.loc[:end_train])})")
print(f"Validation dates : {data.loc[end_train:].index.min()} --- {data.loc[:end_val].index.max()} (n={len(data.loc[end_train:end_val])})")
print(f"Test dates : {data.loc[end_val:].index.min()} --- {data.index.max()} (n={len(data.loc[end_val:])})")
# Plot
# ==============================================================================
fig, ax=plt.subplots(figsize=(7, 3))
data.loc[:end_train].plot(ax=ax, label='train')
data.loc[end_train:end_val].plot(ax=ax, label='validation')
data.loc[end_val:].plot(ax=ax, label='test')
ax.legend();
Train dates      : 1991-07-01 00:00:00 --- 2001-01-01 00:00:00  (n=115)
Validation dates : 2001-02-01 00:00:00 --- 2006-01-01 00:00:00  (n=60)
Test dates       : 2006-02-01 00:00:00 --- 2008-06-01 00:00:00  (n=29)
Grid search¶
Grid search is a popular hyperparameter tuning technique that evaluates an exhaustive list of combinations of hyperparameters and lags to find the optimal configuration for a forecasting model. To perform a grid search with the Skforecast library, two grids are needed: one with different lags (`lags_grid`) and another with the hyperparameters (`param_grid`).
The grid search process involves the following steps:

1. `grid_search_forecaster` creates a copy of the forecaster object and replaces the `lags` argument with the first option appearing in `lags_grid`.

2. The function validates all combinations of hyperparameters presented in `param_grid` using backtesting.

3. The function repeats these two steps until it has evaluated all possible combinations of lags and hyperparameters.

4. If `return_best = True`, the original forecaster is trained with the best lags and hyperparameters configuration found during the grid search process.
# Grid search hyperparameters and lags
# ==============================================================================
forecaster = ForecasterAutoreg(
regressor = RandomForestRegressor(random_state=123),
lags = 10 # Placeholder, the value will be overwritten
)
# Lags used as predictors
lags_grid = [3, 10, [1, 2, 3, 20]]
# Regressor hyperparameters
param_grid = {'n_estimators': [50, 100],
'max_depth': [5, 10, 15]}
results_grid = grid_search_forecaster(
forecaster = forecaster,
y = data.loc[:end_val, 'y'],
param_grid = param_grid,
lags_grid = lags_grid,
steps = 12,
refit = True,
metric = 'mean_squared_error',
initial_train_size = len(data.loc[:end_train]),
fixed_train_size = False,
return_best = True,
verbose = False
)
Number of models compared: 18.
loop lags_grid: 100%|███████████████████████████████████████| 3/3 [00:13<00:00, 4.65s/it]
`Forecaster` refitted using the best-found lags and parameters, and the whole data set:
  Lags: [ 1  2  3  4  5  6  7  8  9 10]
  Parameters: {'max_depth': 5, 'n_estimators': 50}
  Backtesting metric: 0.03344857370906804
results_grid
|    | lags                            | params                                 | mean_squared_error | max_depth | n_estimators |
|----|---------------------------------|----------------------------------------|--------------------|-----------|--------------|
| 6  | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | {'max_depth': 5, 'n_estimators': 50}   | 0.033449           | 5         | 50           |
| 8  | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | {'max_depth': 10, 'n_estimators': 50}  | 0.039221           | 10        | 50           |
| 11 | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | {'max_depth': 15, 'n_estimators': 100} | 0.039266           | 15        | 100          |
| 7  | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | {'max_depth': 5, 'n_estimators': 100}  | 0.039526           | 5         | 100          |
| 9  | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | {'max_depth': 10, 'n_estimators': 100} | 0.040241           | 10        | 100          |
| 10 | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | {'max_depth': 15, 'n_estimators': 50}  | 0.040765           | 15        | 50           |
| 17 | [1, 2, 3, 20]                   | {'max_depth': 15, 'n_estimators': 100} | 0.043909           | 15        | 100          |
| 13 | [1, 2, 3, 20]                   | {'max_depth': 5, 'n_estimators': 100}  | 0.044992           | 5         | 100          |
| 12 | [1, 2, 3, 20]                   | {'max_depth': 5, 'n_estimators': 50}   | 0.046224           | 5         | 50           |
| 0  | [1, 2, 3]                       | {'max_depth': 5, 'n_estimators': 50}   | 0.048666           | 5         | 50           |
| 15 | [1, 2, 3, 20]                   | {'max_depth': 10, 'n_estimators': 100} | 0.048991           | 10        | 100          |
| 14 | [1, 2, 3, 20]                   | {'max_depth': 10, 'n_estimators': 50}  | 0.050193           | 10        | 50           |
| 5  | [1, 2, 3]                       | {'max_depth': 15, 'n_estimators': 100} | 0.050556           | 15        | 100          |
| 16 | [1, 2, 3, 20]                   | {'max_depth': 15, 'n_estimators': 50}  | 0.051217           | 15        | 50           |
| 1  | [1, 2, 3]                       | {'max_depth': 5, 'n_estimators': 100}  | 0.053123           | 5         | 100          |
| 4  | [1, 2, 3]                       | {'max_depth': 15, 'n_estimators': 50}  | 0.060260           | 15        | 50           |
| 2  | [1, 2, 3]                       | {'max_depth': 10, 'n_estimators': 50}  | 0.060951           | 10        | 50           |
| 3  | [1, 2, 3]                       | {'max_depth': 10, 'n_estimators': 100} | 0.067334           | 10        | 100          |
Since `return_best = True`, the forecaster object is updated with the best configuration found and trained with the whole data set. This means that the final model obtained from grid search uses the combination of lags and hyperparameters that achieved the best backtesting metric. This final model can then be used for future predictions on new data, as sketched after the forecaster summary below.
forecaster
================= 
ForecasterAutoreg 
================= 
Regressor: RandomForestRegressor(max_depth=5, n_estimators=50, random_state=123) 
Lags: [ 1  2  3  4  5  6  7  8  9 10] 
Transformer for y: None 
Transformer for exog: None 
Window size: 10 
Weight function included: False 
Exogenous included: False 
Type of exogenous variable: None 
Exogenous variables names: None 
Training range: [Timestamp('1991-07-01 00:00:00'), Timestamp('2006-01-01 00:00:00')] 
Training index type: DatetimeIndex 
Training index frequency: MS 
Regressor parameters: {'bootstrap': True, 'ccp_alpha': 0.0, 'criterion': 'squared_error', 'max_depth': 5, 'max_features': 1.0, 'max_leaf_nodes': None, 'max_samples': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'n_estimators': 50, 'n_jobs': None, 'oob_score': False, 'random_state': 123, 'verbose': 0, 'warm_start': False} 
Creation date: 2023-04-07 00:01:41 
Last fit date: 2023-04-07 00:01:55 
Skforecast version: 0.7.0 
Python version: 3.10.0 
Forecaster id: None 
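Since the forecaster is already trained, it can produce forecasts beyond the data used in the search right away. A minimal sketch (the 12-step horizon is illustrative, not part of the original example):

# Forecast with the tuned forecaster (illustrative)
# ==============================================================================
# The forecaster was refitted on data up to end_val, so the predictions
# start right after 2006-01-01.
predictions = forecaster.predict(steps=12)
predictions.head(3)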
Random search¶
Random search is another hyperparameter tuning strategy available in the Skforecast library. In contrast to grid search, which tries out all possible combinations of hyperparameters and lags, random search samples a fixed number of candidates from the specified distributions. The number of hyperparameter combinations evaluated for each lags configuration is given by `n_iter`.
It is important to note that random sampling is only applied to the model hyperparameters, but not to the lags. All lags specified by the user are evaluated.
# Random search hyperparameters and lags
# ==============================================================================
forecaster = ForecasterAutoreg(
regressor = RandomForestRegressor(random_state=123),
lags = 10 # Placeholder, the value will be overwritten
)
# Lags used as predictors
lags_grid = [3, 5]
# Regressor hyperparameters
param_distributions = {'n_estimators': np.arange(start=10, stop=100, step=1, dtype=int),
'max_depth': np.arange(start=5, stop=30, step=1, dtype=int)}
results = random_search_forecaster(
forecaster = forecaster,
y = data.loc[:end_val, 'y'],
steps = 12,
lags_grid = lags_grid,
param_distributions = param_distributions,
n_iter = 5,
metric = 'mean_squared_error',
refit = True,
initial_train_size = len(data.loc[:end_train]),
fixed_train_size = False,
return_best = True,
random_state = 123,
verbose = False
)
results_grid.head(4)
Number of models compared: 10.
loop lags_grid: 100%|███████████████████████████████████████| 2/2 [00:07<00:00, 3.88s/it]
`Forecaster` refitted using the best-found lags and parameters, and the whole data set:
  Lags: [1 2 3 4 5]
  Parameters: {'n_estimators': 77, 'max_depth': 17}
  Backtesting metric: 0.03147248676391345
|    | lags                            | params                                 | mean_squared_error | max_depth | n_estimators |
|----|---------------------------------|----------------------------------------|--------------------|-----------|--------------|
| 6  | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | {'max_depth': 5, 'n_estimators': 50}   | 0.033449           | 5         | 50           |
| 8  | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | {'max_depth': 10, 'n_estimators': 50}  | 0.039221           | 10        | 50           |
| 11 | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | {'max_depth': 15, 'n_estimators': 100} | 0.039266           | 15        | 100          |
| 7  | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | {'max_depth': 5, 'n_estimators': 100}  | 0.039526           | 5         | 100          |
Bayesian search¶
Grid and random search can generate good results, especially when the search range is narrowed down. However, neither of them takes into account the results obtained so far, which prevents them from focusing the search on the regions of greatest interest while avoiding unnecessary ones.
An alternative is to use Bayesian optimization methods to search for hyperparameters. In general terms, Bayesian hyperparameter optimization consists of building a probabilistic model in which the objective function is the model validation metric (RMSE, AUC, accuracy, ...). With this strategy, the search is redirected at each iteration to the regions of greatest interest. The ultimate goal is to reduce the number of hyperparameter combinations with which the model is evaluated, choosing only the best candidates. This approach is particularly advantageous when the search space is very large or model evaluation is very slow.
It is worth noting that, in the context of skforecast, Bayesian search is applied only to the hyperparameters of the model, not to the lags; all lags specified by the user are evaluated.
Skforecast offers two Bayesian optimization engines: Scikit-Optimize and Optuna.
Optuna¶
In skforecast, Bayesian optimization with Optuna is performed using its Study object. The objective of the optimization is to minimize the metric generated by backtesting.
Additional parameters can be passed as dictionaries through the `kwargs_create_study` and `kwargs_study_optimize` arguments, which are forwarded to Optuna's `create_study` function and to the study's `optimize` method, respectively. These arguments are used to configure the study object and the optimization algorithm.
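For example, a deterministic sampler and a time budget can be configured as shown below. This is a minimal sketch: `TPESampler` and `timeout` are standard Optuna options, not part of the original example.

# Optional study and optimization settings (illustrative)
# ==============================================================================
import optuna

# Forwarded to optuna.create_study: TPE sampler with a fixed seed for reproducibility
kwargs_create_study = {'sampler': optuna.samplers.TPESampler(seed=123)}

# Forwarded to study.optimize: stop the search after 60 seconds
kwargs_study_optimize = {'timeout': 60}

These dictionaries can then be passed to `bayesian_search_forecaster` through the arguments of the same name (the example below passes empty dictionaries).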
To use Optuna in skforecast, the `search_space` argument must be a Python function that defines the hyperparameters to optimize. Optuna uses a Trial object to sample each search space. See the Study class for more details.
# Bayesian search hyperparameters and lags with Optuna
# ==============================================================================
forecaster = ForecasterAutoreg(
regressor = RandomForestRegressor(random_state=123),
lags = 10 # Placeholder, the value will be overwritten
)
# Lags used as predictors
lags_grid = [3, 5]
# Regressor hyperparameters search space
def search_space(trial):
search_space = {'n_estimators' : trial.suggest_int('n_estimators', 10, 20),
                    'min_samples_leaf' : trial.suggest_int('min_samples_leaf', 1, 10),
'max_features' : trial.suggest_categorical('max_features', ['log2', 'sqrt'])
}
return search_space
results, frozen_trial = bayesian_search_forecaster(
forecaster = forecaster,
y = data.loc[:end_val, 'y'],
lags_grid = lags_grid,
search_space = search_space,
steps = 12,
metric = 'mean_absolute_error',
refit = True,
initial_train_size = len(data.loc[:end_train]),
fixed_train_size = True,
n_trials = 10,
random_state = 123,
return_best = False,
verbose = False,
engine = 'optuna',
kwargs_create_study = {},
kwargs_study_optimize = {}
)
results_grid.head(4)
Number of models compared: 20, 10 bayesian search in each lag configuration.
loop lags_grid: 100%|███████████████████████████████████████| 2/2 [00:03<00:00, 1.81s/it]
|    | lags                            | params                                 | mean_squared_error | max_depth | n_estimators |
|----|---------------------------------|----------------------------------------|--------------------|-----------|--------------|
| 6  | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | {'max_depth': 5, 'n_estimators': 50}   | 0.033449           | 5         | 50           |
| 8  | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | {'max_depth': 10, 'n_estimators': 50}  | 0.039221           | 10        | 50           |
| 11 | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | {'max_depth': 15, 'n_estimators': 100} | 0.039266           | 15        | 100          |
| 7  | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | {'max_depth': 5, 'n_estimators': 100}  | 0.039526           | 5         | 100          |
`frozen_trial` contains information about the trial that achieved the best results. See Optuna's Study class for more details.
frozen_trial
FrozenTrial(number=0, state=TrialState.COMPLETE, values=[0.15057479525044887], datetime_start=datetime.datetime(2023, 4, 7, 0, 2, 3, 982299), datetime_complete=datetime.datetime(2023, 4, 7, 0, 2, 4, 256723), params={'n_estimators': 17, 'min_samples_leaf': 3, 'max_features': 'sqrt'}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'n_estimators': IntDistribution(high=20, log=False, low=10, step=1), 'min_samples_leaf': IntDistribution(high=10, log=False, low=1, step=1), 'max_features': CategoricalDistribution(choices=('log2', 'sqrt'))}, trial_id=0, value=None)
Scikit-optimize¶
Skopt performs Bayesian optimization with Gaussian processes. This is done with the gp_minimize function, where the objective value to be minimized is calculated by backtesting.
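The call mirrors the Optuna example, but with this engine the search space is defined with skopt.space objects instead of a function. A minimal sketch, assuming `engine = 'skopt'` is selected in `bayesian_search_forecaster` and that the hyperparameter ranges are expressed as `Integer` and `Categorical` dimensions (the ranges shown are illustrative):

# Bayesian search hyperparameters and lags with scikit-optimize (sketch)
# ==============================================================================
from skopt.space import Categorical, Integer

forecaster = ForecasterAutoreg(
                 regressor = RandomForestRegressor(random_state=123),
                 lags      = 10  # Placeholder, the value will be overwritten
             )

# Regressor hyperparameters search space (illustrative ranges)
search_space = {'n_estimators'    : Integer(10, 20, "uniform", name='n_estimators'),
                'min_samples_leaf': Integer(1, 10, "uniform", name='min_samples_leaf'),
                'max_features'    : Categorical(['log2', 'sqrt'], name='max_features')}

# The second returned element holds the engine's optimization details
results, optimize_results = bayesian_search_forecaster(
                                forecaster         = forecaster,
                                y                  = data.loc[:end_val, 'y'],
                                lags_grid          = [3, 5],
                                search_space       = search_space,
                                steps              = 12,
                                metric             = 'mean_absolute_error',
                                refit              = True,
                                initial_train_size = len(data.loc[:end_train]),
                                fixed_train_size   = True,
                                n_trials           = 10,
                                random_state       = 123,
                                return_best        = False,
                                verbose            = False,
                                engine             = 'skopt'
                            )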