model_selection_multiseries
backtesting_forecaster_multiseries(forecaster, series, steps, level, metric, initial_train_size, fixed_train_size=True, exog=None, refit=False, interval=None, n_boot=500, random_state=123, in_sample_residuals=True, verbose=False)
Backtesting of forecaster multiseries model.
If `refit` is False, the model is trained only once using the `initial_train_size` first observations. If `refit` is True, the model is trained in each iteration increasing the training set. A copy of the original forecaster is created so it is not modified during the process.
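The interaction between `refit`, `fixed_train_size` and `steps` can be sketched as a list of train/test index ranges. This is a minimal illustration based only on the descriptions on this page, not the library's internal code: the helper name `backtesting_folds` is hypothetical, and shortening the last fold when fewer than `steps` observations remain is an assumption.

```python
from typing import List, Tuple


def backtesting_folds(
    n_obs: int,
    initial_train_size: int,
    steps: int,
    refit: bool = False,
    fixed_train_size: bool = True,
) -> List[Tuple[range, range]]:
    """Return (train, test) positional ranges for each backtesting fold.

    Hypothetical helper: it only mirrors the behaviour described in the text.
    """
    folds = []
    test_start = initial_train_size
    while test_start < n_obs:
        # Assumption: the last fold is shortened if fewer than `steps` remain.
        test_end = min(test_start + steps, n_obs)
        if not refit:
            # Trained once on the first `initial_train_size` observations.
            train = range(0, initial_train_size)
        elif fixed_train_size:
            # Training window keeps its size and moves forward by `steps`.
            train = range(test_start - initial_train_size, test_start)
        else:
            # Training window grows to include all earlier observations.
            train = range(0, test_start)
        folds.append((train, range(test_start, test_end)))
        test_start = test_end
    return folds


for train, test in backtesting_folds(10, 4, 3, refit=True, fixed_train_size=False):
    print(f"train={list(train)} test={list(test)}")
```

With `refit=False` every fold reuses the same training range, so only the test range advances.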
Parameters:
Name  Type  Description  Default 

forecaster 
ForecasterAutoregMultiSeries 
Forecaster model. 
required 
series 
DataFrame 
Training time series. 
required 
steps 
int 
Number of steps to predict. 
required 
level 
str 
Time series to be predicted. 
required 
metric 
Union[str, callable, list] 
Metric used to quantify the goodness of fit of the model. If string: {'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error', 'mean_squared_log_error'} If callable: Function with arguments y_true, y_pred that returns a float. If list: List containing several strings and/or callable. 
required 
initial_train_size 
Optional[int] 
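Number of samples in the initial train split. If `None` and `forecaster` is already trained, no initial train is done and all data is used to evaluate the model. However, the first `len(forecaster.last_window)` observations are needed to create the initial predictors, so no predictions are calculated for them. `None` is only allowed when `refit` is `False`. 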
required 
fixed_train_size 
bool 
If True, train size doesn't increase but moves by `steps` in each iteration. 
True 
exog 
Union[pandas.core.series.Series, pandas.core.frame.DataFrame] 
Exogenous variable/s included as predictor/s. Must have the same number of observations as `series` and should be aligned so that series[i] is regressed on exog[i]. 
None 
refit 
bool 
Whether to refit the forecaster in each iteration. 
False 
interval 
Optional[list] 
Confidence of the prediction interval estimated. Sequence of percentiles to compute, which must be between 0 and 100 inclusive. If `None`, no intervals are estimated. 
None 
n_boot 
int 
Number of bootstrapping iterations used to estimate prediction intervals. 
500 
random_state 
int 
Sets a seed to the random generator, so that boot intervals are always deterministic. 
123 
in_sample_residuals 
bool 
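If `True`, residuals from the training data are used as proxy of prediction error to create prediction intervals. If `False`, out_sample_residuals are used if they are already stored inside the forecaster. 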
True 
verbose 
bool 
Print number of folds and index of training and validation sets used for backtesting. 
False 
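As an illustration of the callable form of `metric`, the sketch below defines a custom function taking `y_true` and `y_pred` and returning a float. The function name `max_absolute_error` is a made-up example and not part of the library. It is written in plain Python so it works with any iterable of numbers.

```python
def max_absolute_error(y_true, y_pred) -> float:
    """Largest absolute deviation between observed and predicted values."""
    return float(max(abs(t - p) for t, p in zip(y_true, y_pred)))


# A list may mix built-in metric names (strings) with callables.
metrics = ['mean_absolute_error', max_absolute_error]

print(max_absolute_error([1, 2, 3], [1, 4, 2]))  # 2.0
```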
Returns:
Type  Description 

Tuple[Union[float, list], pandas.core.frame.DataFrame] 
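Value(s) of the metric(s), followed by a DataFrame with the predictions and their estimated interval if `interval` is not `None` (columns pred, lower_bound, upper_bound). 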
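To make the roles of `interval`, `n_boot`, `random_state` and the residuals more concrete, here is a simplified single-step sketch of a bootstrapped prediction interval. It is an assumption-laden illustration, not the library's implementation, which may differ, for example in how errors propagate over several steps. The names `percentile` and `bootstrap_interval` are hypothetical.

```python
import random
from typing import Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile q (0 to 100) using linear interpolation between sorted values."""
    if not 0 <= q <= 100:
        raise ValueError("q must be between 0 and 100 inclusive")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def bootstrap_interval(
    pred: float,
    residuals: Sequence[float],
    interval: Sequence[float],
    n_boot: int = 500,
    random_state: int = 123,
) -> Tuple[float, float]:
    """Lower and upper bounds obtained by adding resampled residuals to `pred`."""
    rng = random.Random(random_state)  # fixed seed makes the bounds reproducible
    simulated = [pred + rng.choice(residuals) for _ in range(n_boot)]
    return percentile(simulated, interval[0]), percentile(simulated, interval[1])


lower, upper = bootstrap_interval(10.0, [-1.0, 0.0, 1.0], interval=[5, 95])
print(lower, upper)
```

Calling the function twice with the same `random_state` returns the same bounds, which is the behaviour the `random_state` description refers to.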
Source code in skforecast/model_selection_multiseries/model_selection_multiseries.py
def backtesting_forecaster_multiseries(
    forecaster,
    series: pd.DataFrame,
    steps: int,
    level: str,
    metric: Union[str, callable, list],
    initial_train_size: Optional[int],
    fixed_train_size: bool=True,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    refit: bool=False,
    interval: Optional[list]=None,
    n_boot: int=500,
    random_state: int=123,
    in_sample_residuals: bool=True,
    verbose: bool=False
) -> Tuple[Union[float, list], pd.DataFrame]:
    """
    Backtesting of forecaster multiseries model.

    If `refit` is False, the model is trained only once using the `initial_train_size`
    first observations. If `refit` is True, the model is trained in each iteration
    increasing the training set. A copy of the original forecaster is created so
    it is not modified during the process.

    Parameters
    ----------
    forecaster : ForecasterAutoregMultiSeries
        Forecaster model.
    series : pandas DataFrame
        Training time series.
    steps : int
        Number of steps to predict.
    level : str
        Time series to be predicted.
    metric : str, callable, list
        Metric used to quantify the goodness of fit of the model.
        If string:
            {'mean_squared_error', 'mean_absolute_error',
            'mean_absolute_percentage_error', 'mean_squared_log_error'}
        If callable:
            Function with arguments y_true, y_pred that returns a float.
        If list:
            List containing several strings and/or callable.
    initial_train_size : int, default `None`
        Number of samples in the initial train split. If `None` and `forecaster` is already
        trained, no initial train is done and all data is used to evaluate the model. However,
        the first `len(forecaster.last_window)` observations are needed to create the
        initial predictors, so no predictions are calculated for them.
        `None` is only allowed when `refit` is `False`.
    fixed_train_size : bool, default `True`
        If True, train size doesn't increase but moves by `steps` in each iteration.
    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `series` and should be aligned so that series[i] is
        regressed on exog[i].
    refit : bool, default `False`
        Whether to refit the forecaster in each iteration.
    interval : list, default `None`
        Confidence of the prediction interval estimated. Sequence of percentiles
        to compute, which must be between 0 and 100 inclusive. If `None`, no
        intervals are estimated. Only available for forecaster of type ForecasterAutoreg
        and ForecasterAutoregCustom.
    n_boot : int, default `500`
        Number of bootstrapping iterations used to estimate prediction
        intervals.
    random_state : int, default `123`
        Sets a seed to the random generator, so that boot intervals are always
        deterministic.
    in_sample_residuals : bool, default `True`
        If `True`, residuals from the training data are used as proxy of
        prediction error to create prediction intervals. If `False`, out_sample_residuals
        are used if they are already stored inside the forecaster.
    verbose : bool, default `False`
        Print number of folds and index of training and validation sets used for backtesting.

    Returns
    -------
    metrics_value : float, list
        Value(s) of the metric(s).
    backtest_predictions : pandas DataFrame
        Value of predictions and their estimated interval if `interval` is not `None`.
            column pred = predictions.
            column lower_bound = lower bound of the interval.
            column upper_bound = upper bound of the interval.
    """

    # Validate the arguments before dispatching to the refit / no-refit helpers.
    if initial_train_size is not None and initial_train_size > len(series):
        raise Exception(
            'If used, `initial_train_size` must be smaller than length of `series`.'
        )

    if initial_train_size is not None and initial_train_size < forecaster.window_size:
        raise Exception(
            f"`initial_train_size` must be greater than "
            f"forecaster's window_size ({forecaster.window_size})."
        )

    if initial_train_size is None and not forecaster.fitted:
        raise Exception(
            '`forecaster` must be already trained if no `initial_train_size` is provided.'
        )

    if not isinstance(refit, bool):
        raise Exception(
            '`refit` must be boolean: True, False.'
        )

    if initial_train_size is None and refit:
        raise Exception(
            '`refit` is only allowed when there is an `initial_train_size`.'
        )

    if not isinstance(forecaster, ForecasterAutoregMultiSeries):
        raise Exception(
            ('`forecaster` must be of type `ForecasterAutoregMultiSeries`, for all other '
             'types of forecasters use the functions in `model_selection` module.')
        )

    if refit:
        metrics_values, backtest_predictions = _backtesting_forecaster_multiseries_refit(
            forecaster = forecaster,
            series = series,
            steps = steps,
            level = level,
            metric = metric,
            initial_train_size = initial_train_size,
            fixed_train_size = fixed_train_size,
            exog = exog,
            interval = interval,
            n_boot = n_boot,
            random_state = random_state,
            in_sample_residuals = in_sample_residuals,
            verbose = verbose
        )
    else:
        metrics_values, backtest_predictions = _backtesting_forecaster_multiseries_no_refit(
            forecaster = forecaster,
            series = series,
            steps = steps,
            level = level,
            metric = metric,
            initial_train_size = initial_train_size,
            exog = exog,
            interval = interval,
            n_boot = n_boot,
            random_state = random_state,
            in_sample_residuals = in_sample_residuals,
            verbose = verbose
        )

    return metrics_values, backtest_predictions
grid_search_forecaster_multiseries(forecaster, series, param_grid, steps, metric, initial_train_size, fixed_train_size=True, levels_list=None, levels_weights=None, exog=None, lags_grid=None, refit=False, return_best=True, verbose=True)
Exhaustive search over specified parameter values for a Forecaster object.
Validation is done using multiseries backtesting.
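An exhaustive search evaluates every combination of the candidate values. The source shown on this page wraps `param_grid` with scikit-learn's `ParameterGrid`. The sketch below uses `itertools.product` as a dependency-free stand-in to show how many candidates result. The helper name `expand_grid` is hypothetical, and pairing every `lags_grid` entry with every parameter combination is an assumption about how the two grids are combined.

```python
from itertools import product
from typing import Any, Dict, Iterator, List


def expand_grid(
    param_grid: Dict[str, List[Any]],
    lags_grid: List[Any],
) -> Iterator[Dict[str, Any]]:
    """Yield one candidate per (lags, parameter combination) pair."""
    keys = sorted(param_grid)
    for lags in lags_grid:
        for values in product(*(param_grid[key] for key in keys)):
            yield {'lags': lags, 'params': dict(zip(keys, values))}


param_grid = {'n_estimators': [50, 100], 'max_depth': [3, 5]}
lags_grid = [3, [1, 2, 24]]

candidates = list(expand_grid(param_grid, lags_grid))
print(len(candidates))  # 2 lags options x 2 x 2 parameter values = 8
```

Each candidate is then scored with multiseries backtesting, so the total cost grows with the product of all grid sizes.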
Parameters:
Name  Type  Description  Default 

forecaster 
ForecasterAutoregMultiSeries 
Forecaster model. 
required 
series 
DataFrame 
Training time series. 
required 
param_grid 
dict 
Dictionary with parameters names (`str`) as keys and lists of parameter settings to try as values. 
required 
steps 
int 
Number of steps to predict. 
required 
metric 
Union[str, callable] 
Metric used to quantify the goodness of fit of the model. If string: {'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error', 'mean_squared_log_error'} If callable: Function with arguments y_true, y_pred that returns a float. 
required 
initial_train_size 
int 
Number of samples in the initial train split. 
required 
fixed_train_size 
bool 
If True, train size doesn't increase but moves by `steps` in each iteration. 
True 
levels_list 
Union[str, list] 
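Level (`str`) or levels (`list`) on which the forecaster is optimized. If `None`, all levels are taken into account. The resulting metric will be a weighted average of the optimization of all levels. See also `levels_weights`. 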
None 
levels_weights 
dict 
Weights associated with levels in the form `{level: weight}`. If `None`, all levels have the same weight. 
None 
exog 
Union[pandas.core.series.Series, pandas.core.frame.DataFrame] 
Exogenous variable/s included as predictor/s. Must have the same number of observations as `series` and should be aligned so that series[i] is regressed on exog[i]. 
None 
lags_grid 
Optional[list] 
Lists of `lags` to try. 
None 
refit 
bool 
Whether to refit the forecaster in each iteration of backtesting. 
False 
return_best 
bool 
Refit the `forecaster` using the best found parameters on the whole data. 
True 
verbose 
bool 
Print number of folds used for cv or backtesting. 
True 
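The `levels_list` and `levels_weights` descriptions say the resulting metric is a weighted average over levels, with equal weights when `levels_weights` is `None`. A minimal sketch of that aggregation follows. The helper name `weighted_level_metric` is hypothetical, and normalising by the sum of the weights is an assumption about how the average is formed.

```python
from typing import Dict, Optional


def weighted_level_metric(
    metrics_by_level: Dict[str, float],
    levels_weights: Optional[Dict[str, float]] = None,
) -> float:
    """Combine per-level metric values into a single weighted average."""
    if levels_weights is None:
        # No weights given: every level counts the same.
        levels_weights = {level: 1.0 for level in metrics_by_level}
    total_weight = sum(levels_weights[level] for level in metrics_by_level)
    weighted_sum = sum(
        metrics_by_level[level] * levels_weights[level]
        for level in metrics_by_level
    )
    return weighted_sum / total_weight


metrics_by_level = {'item_1': 2.0, 'item_2': 4.0}
print(weighted_level_metric(metrics_by_level))                              # 3.0
print(weighted_level_metric(metrics_by_level, {'item_1': 3, 'item_2': 1}))  # 2.5
```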
Returns:
Type  Description 

DataFrame 
Results for each combination of parameters. column lags = lags configuration evaluated. column params = parameter values evaluated. column metric = metric value estimated for the combination of parameters. additional n columns with param = value. 
Source code in skforecast/model_selection_multiseries/model_selection_multiseries.py
def grid_search_forecaster_multiseries(
    forecaster,
    series: pd.DataFrame,
    param_grid: dict,
    steps: int,
    metric: Union[str, callable],
    initial_train_size: int,
    fixed_train_size: bool=True,
    levels_list: Union[str, list]=None,
    levels_weights: dict=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    lags_grid: Optional[list]=None,
    refit: bool=False,
    return_best: bool=True,
    verbose: bool=True
) -> pd.DataFrame:
    """
    Exhaustive search over specified parameter values for a Forecaster object.
    Validation is done using multiseries backtesting.

    Parameters
    ----------
    forecaster : ForecasterAutoregMultiSeries
        Forecaster model.
    series : pandas DataFrame
        Training time series.
    param_grid : dict
        Dictionary with parameters names (`str`) as keys and lists of parameter
        settings to try as values.
    steps : int
        Number of steps to predict.
    metric : str, callable
        Metric used to quantify the goodness of fit of the model.
        If string:
            {'mean_squared_error', 'mean_absolute_error',
            'mean_absolute_percentage_error', 'mean_squared_log_error'}
        If callable:
            Function with arguments y_true, y_pred that returns a float.
    initial_train_size : int
        Number of samples in the initial train split.
    fixed_train_size : bool, default `True`
        If True, train size doesn't increase but moves by `steps` in each iteration.
    levels_list : str, list, default `None`
        level (`str`) or levels (`list`) on which the forecaster is optimized.
        If `None`, all levels are taken into account. The resulting metric will be
        a weighted average of the optimization of all levels. See also `levels_weights`.
    levels_weights : dict, default `None`
        Weights associated with levels in the form `{level: weight}`.
        If `None`, all levels have the same weight.
    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `series` and should be aligned so that series[i] is
        regressed on exog[i].
    lags_grid : list of int, lists, np.ndarray or range, default `None`
        Lists of `lags` to try. Only used if forecaster is an instance of
        `ForecasterAutoreg`, `ForecasterAutoregDirect` or `ForecasterAutoregMultiOutput`.
    refit : bool, default `False`
        Whether to refit the forecaster in each iteration of backtesting.
    return_best : bool, default `True`
        Refit the `forecaster` using the best found parameters on the whole data.
    verbose : bool, default `True`
        Print number of folds used for cv or backtesting.

    Returns
    -------
    results : pandas DataFrame
        Results for each combination of parameters.
            column lags = lags configuration evaluated.
            column params = parameter values evaluated.
            column metric = metric value estimated for the combination of parameters.
            additional n columns with param = value.
    """

    # Expand the dictionary into the full list of parameter combinations.
    param_grid = list(ParameterGrid(param_grid))

    results = _evaluate_grid_hyperparameters_multiseries(
        forecaster = forecaster,
        series = series,
        param_grid = param_grid,
        steps = steps,
        metric = metric,
        initial_train_size = initial_train_size,
        fixed_train_size = fixed_train_size,
        levels_list = levels_list,
        levels_weights = levels_weights,
        exog = exog,
        lags_grid = lags_grid,
        refit = refit,
        return_best = return_best,
        verbose = verbose
    )

    return results