`model_selection_multiseries`

`backtesting_forecaster_multiseries(forecaster, series, steps, metric, initial_train_size, fixed_train_size=True, gap=0, allow_incomplete_fold=True, levels=None, exog=None, refit=False, interval=None, n_boot=500, random_state=123, in_sample_residuals=True, verbose=False, show_progress=True)`

Backtesting for multi-series and multivariate forecasters.

If `refit` is `False`, the model is trained only once, using the first `initial_train_size` observations. If `refit` is `True`, the model is retrained in each iteration; the training set grows with each iteration unless `fixed_train_size` is `True`, in which case it keeps its size and moves forward by `steps`. A copy of the original forecaster is created so it is not modified during the process.
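The splitting scheme described above can be sketched in plain Python. This is a minimal illustration, not skforecast's internal implementation: the helper name `backtesting_folds` and the half-open index convention are assumptions made for the example, and the handling of `gap` and `allow_incomplete_fold` follows the parameter descriptions on this page.

```python
def backtesting_folds(n_obs, initial_train_size, steps, refit=False,
                      fixed_train_size=True, gap=0, allow_incomplete_fold=True):
    """Return (train_start, train_end, test_start, test_end) tuples.

    Hypothetical helper: all ranges are half-open, [start, end).
    """
    folds = []
    fold = 0
    while True:
        # End of the data available before this fold's predictions.
        window_end = initial_train_size + fold * steps
        # `gap` observations are skipped between training and test sets.
        test_start = window_end + gap
        if test_start >= n_obs:
            break
        test_end = min(test_start + steps, n_obs)
        # Drop a shorter last fold when incomplete folds are not allowed.
        if test_end - test_start < steps and not allow_incomplete_fold:
            break
        if refit:
            # Rolling window if fixed_train_size, expanding window otherwise.
            train_start = fold * steps if fixed_train_size else 0
            train_end = window_end
        else:
            # Trained once on the first initial_train_size observations.
            train_start, train_end = 0, initial_train_size
        folds.append((train_start, train_end, test_start, test_end))
        fold += 1
    return folds


folds = backtesting_folds(n_obs=20, initial_train_size=10, steps=4, refit=True)
for train_start, train_end, test_start, test_end in folds:
    print(f"train [{train_start}, {train_end}) -> test [{test_start}, {test_end})")
```

With 20 observations, the last fold in this call contains only 2 test observations; passing `allow_incomplete_fold=False` would drop it.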

Parameters:

Name	Type	Description	Default
`forecaster`	`ForecasterAutoregMultiSeries, ForecasterAutoregMultiSeriesCustom, ForecasterAutoregMultiVariate`	Forecaster model.	required
`series`	`DataFrame`	Training time series.	required
`steps`	`int`	Number of steps to predict.	required
`metric`	`Union[str, Callable, list]`	Metric used to quantify the goodness of fit of the model. If string: {'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error', 'mean_squared_log_error'} If Callable: Function with arguments y_true, y_pred that returns a float. If list: List containing multiple strings and/or Callables.	required
`initial_train_size`	`Optional[int]`	Number of samples in the initial train split. If `None` and `forecaster` is already trained, no initial train is done and all data is used to evaluate the model. However, the first `len(forecaster.last_window)` observations are needed to create the initial predictors, so no predictions are calculated for them. `None` is only allowed when `refit` is `False`.	required
`fixed_train_size`	`bool`	If True, train size doesn't increase but moves by `steps` in each iteration.	`True`
`gap`	`int`	Number of samples to be excluded after the end of each training set and before the test set.	`0`
`allow_incomplete_fold`	`bool`	Last fold is allowed to have a smaller number of samples than the `test_size`. If `False`, the last fold is excluded.	`True`
`levels`	`Union[str, list]`	Time series to be predicted. If `None` all levels will be predicted. New in version 0.6.0	`None`
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s. Must have the same number of observations as `series` and should be aligned so that series[i] is regressed on exog[i].	`None`
`refit`	`bool`	Whether to re-fit the forecaster in each iteration.	`False`
`interval`	`Optional[list]`	Confidence of the prediction interval estimated. Sequence of percentiles to compute, which must be between 0 and 100 inclusive. If `None`, no intervals are estimated. Only available for forecaster of type ForecasterAutoreg and ForecasterAutoregCustom.	`None`
`n_boot`	`int`	Number of bootstrapping iterations used to estimate prediction intervals.	`500`
`random_state`	`int`	Sets a seed to the random generator, so that boot intervals are always deterministic.	`123`
`in_sample_residuals`	`bool`	If `True`, residuals from the training data are used as proxy of prediction error to create prediction intervals. If `False`, out_sample_residuals are used if they are already stored inside the forecaster.	`True`
`verbose`	`bool`	Print number of folds and index of training and validation sets used for backtesting.	`False`
`show_progress`	`bool`	Whether to show a progress bar. Defaults to True.	`True`
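As the `metric` row states, a custom metric is any function that takes `y_true` and `y_pred` and returns a float. A minimal sketch follows; the name `custom_rmse` is hypothetical, and the function is written against plain iterables so it does not depend on any particular array library.

```python
import math


def custom_rmse(y_true, y_pred):
    """Root mean squared error: accepts y_true and y_pred, returns a float."""
    squared_errors = [(float(t) - float(p)) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# A list can mix built-in metric names and callables.
metric = ['mean_absolute_error', custom_rmse]
```

Passing a list like `metric` above requests one result column per entry.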

Returns:

Type	Description
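`Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]`	`metrics_levels`: value(s) of the metric(s); the index holds the levels and the columns the metrics. `backtest_predictions`: predictions and, if `interval` is not `None`, their estimated interval (columns `pred`, `lower_bound`, `upper_bound`), repeated for each level.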
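To make the roles of `interval`, `n_boot`, `random_state` and the residuals concrete, the following sketch bootstraps an interval around a single prediction. It is a simplified, single-step illustration under stated assumptions, not the library's procedure: `percentile` and `bootstrap_interval` are hypothetical helpers, and multi-step forecasting would need the resampling to be applied at every step.

```python
import random


def percentile(values, q):
    """Percentile with linear interpolation, q between 0 and 100."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    low = int(position)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (position - low)


def bootstrap_interval(prediction, residuals, interval=(10, 90),
                       n_boot=500, random_state=123):
    """Return (lower_bound, upper_bound) for one predicted value."""
    rng = random.Random(random_state)  # fixed seed -> reproducible bounds
    # Each bootstrap iteration adds one resampled residual to the prediction.
    simulated = [prediction + rng.choice(residuals) for _ in range(n_boot)]
    return percentile(simulated, interval[0]), percentile(simulated, interval[1])


lower_bound, upper_bound = bootstrap_interval(10.0, residuals=[-1.0, 0.0, 1.0])
```

The `residuals` argument stands in for either in-sample or out-of-sample residuals, which is the choice `in_sample_residuals` controls.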

Source code in skforecast/model_selection_multiseries/model_selection_multiseries.py

def backtesting_forecaster_multiseries(
    forecaster,
    series: pd.DataFrame,
    steps: int,
    metric: Union[str, Callable, list],
    initial_train_size: Optional[int],
    fixed_train_size: bool=True,
    gap: int=0,
    allow_incomplete_fold: bool=True,
    levels: Optional[Union[str, list]]=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    refit: bool=False,
    interval: Optional[list]=None,
    n_boot: int=500,
    random_state: int=123,
    in_sample_residuals: bool=True,
    verbose: bool=False,
    show_progress: bool=True
) -> Tuple[pd.DataFrame, pd.DataFrame]:
    """
    Backtesting for multi-series and multivariate forecasters.

    If `refit` is False, the model is trained only once using the `initial_train_size`
    first observations. If `refit` is True, the model is trained in each iteration
    increasing the training set. A copy of the original forecaster is created so 
    it is not modified during the process.

    Parameters
    ----------
    forecaster : ForecasterAutoregMultiSeries, ForecasterAutoregMultiSeriesCustom, ForecasterAutoregMultiVariate
        Forecaster model.

    series : pandas DataFrame
        Training time series.

    steps : int
        Number of steps to predict.

    metric : str, Callable, list
        Metric used to quantify the goodness of fit of the model.

        If string:
            {'mean_squared_error', 'mean_absolute_error',
             'mean_absolute_percentage_error', 'mean_squared_log_error'}

        If Callable:
            Function with arguments y_true, y_pred that returns a float.

        If list:
            List containing multiple strings and/or Callables.

    initial_train_size : int, None
        Number of samples in the initial train split. If `None` and `forecaster` is already 
        trained, no initial train is done and all data is used to evaluate the model. However, 
        the first `len(forecaster.last_window)` observations are needed to create the 
        initial predictors, so no predictions are calculated for them.

        `None` is only allowed when `refit` is `False`.

    fixed_train_size : bool, default `True`
        If True, train size doesn't increase but moves by `steps` in each iteration.

    gap : int, default `0`
        Number of samples to be excluded after the end of each training set and 
        before the test set.

    allow_incomplete_fold : bool, default `True`
        Last fold is allowed to have a smaller number of samples than the 
        `test_size`. If `False`, the last fold is excluded.

    levels : str, list, default `None`
        Time series to be predicted. If `None` all levels will be predicted.
        **New in version 0.6.0**

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `series` and should be aligned so that
        series[i] is regressed on exog[i].

    refit : bool, default `False`
        Whether to re-fit the forecaster in each iteration.

    interval : list, default `None`
        Confidence of the prediction interval estimated. Sequence of percentiles
        to compute, which must be between 0 and 100 inclusive. If `None`, no
        intervals are estimated. Only available for forecaster of type ForecasterAutoreg
        and ForecasterAutoregCustom.

    n_boot : int, default `500`
        Number of bootstrapping iterations used to estimate prediction
        intervals.

    random_state : int, default `123`
        Sets a seed to the random generator, so that boot intervals are always 
        deterministic.

    in_sample_residuals : bool, default `True`
        If `True`, residuals from the training data are used as proxy of
        prediction error to create prediction intervals.  If `False`, out_sample_residuals
        are used if they are already stored inside the forecaster.

    verbose : bool, default `False`
        Print number of folds and index of training and validation sets used 
        for backtesting.

    show_progress : bool, default `True`
        Whether to show a progress bar. Defaults to True.

    Returns 
    -------
    metrics_levels : pandas DataFrame
        Value(s) of the metric(s). Index are the levels and columns the metrics.

    backtest_predictions : pandas DataFrame
        Value of predictions and their estimated interval if `interval` is not `None`.
        If there is more than one level, this structure will be repeated for each of them.
            column pred = predictions.
            column lower_bound = lower bound of the interval.
            column upper_bound = upper bound of the interval.

    """

    if type(forecaster).__name__ not in ['ForecasterAutoregMultiSeries', 
                                         'ForecasterAutoregMultiSeriesCustom', 
                                         'ForecasterAutoregMultiVariate']:
        raise TypeError(
            ("`forecaster` must be of type `ForecasterAutoregMultiSeries`, "
             "`ForecasterAutoregMultiSeriesCustom` or `ForecasterAutoregMultiVariate`, "
             "for all other types of forecasters use the functions available in "
             f"the `model_selection` module. Got {type(forecaster).__name__}")
        )

    check_backtesting_input(
        forecaster            = forecaster,
        steps                 = steps,
        metric                = metric,
        series                = series,
        initial_train_size    = initial_train_size,
        fixed_train_size      = fixed_train_size,
        gap                   = gap,
        allow_incomplete_fold = allow_incomplete_fold,
        refit                 = refit,
        interval              = interval,
        n_boot                = n_boot,
        random_state          = random_state,
        in_sample_residuals   = in_sample_residuals,
        verbose               = verbose,
        show_progress         = show_progress
    )

    if type(forecaster).__name__ in ['ForecasterAutoregMultiSeries', 
                                     'ForecasterAutoregMultiSeriesCustom'] \
        and levels is not None and not isinstance(levels, (str, list)):
        raise TypeError(
            ("`levels` must be a `list` of column names, a `str` of a column name "
             "or `None` when using a `ForecasterAutoregMultiSeries` or "
             "`ForecasterAutoregMultiSeriesCustom`. If the forecaster is of type "
             "`ForecasterAutoregMultiVariate`, this argument is ignored.")
        )

    if type(forecaster).__name__ == 'ForecasterAutoregMultiVariate' \
        and levels and levels != forecaster.level and levels != [forecaster.level]:
        warnings.warn(
            (f"`levels` argument have no use when the forecaster is of type "
             f"`ForecasterAutoregMultiVariate`. The level of this forecaster is "
             f"{forecaster.level}, to predict another level, change the `level` "
             f"argument when initializing the forecaster."),
             IgnoredArgumentWarning
        )

    if refit:
        metrics_levels, backtest_predictions = _backtesting_forecaster_multiseries_refit(
            forecaster            = forecaster,
            series                = series,
            steps                 = steps,
            levels                = levels,
            metric                = metric,
            initial_train_size    = initial_train_size,
            fixed_train_size      = fixed_train_size,
            gap                   = gap,
            allow_incomplete_fold = allow_incomplete_fold,
            exog                  = exog,
            interval              = interval,
            n_boot                = n_boot,
            random_state          = random_state,
            in_sample_residuals   = in_sample_residuals,
            verbose               = verbose,
            show_progress         = show_progress
        )
    else:
        metrics_levels, backtest_predictions = _backtesting_forecaster_multiseries_no_refit(
            forecaster            = forecaster,
            series                = series,
            steps                 = steps,
            levels                = levels,
            metric                = metric,
            initial_train_size    = initial_train_size,
            gap                   = gap,
            allow_incomplete_fold = allow_incomplete_fold,
            exog                  = exog,
            interval              = interval,
            n_boot                = n_boot,
            random_state          = random_state,
            in_sample_residuals   = in_sample_residuals,
            verbose               = verbose,
            show_progress         = show_progress
        )

    return metrics_levels, backtest_predictions

`grid_search_forecaster_multiseries(forecaster, series, param_grid, steps, metric, initial_train_size, fixed_train_size=True, gap=0, allow_incomplete_fold=True, levels=None, exog=None, lags_grid=None, refit=False, return_best=True, verbose=True)`

Exhaustive search over specified parameter values for a Forecaster object.

Validation is done using multi-series backtesting.
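An exhaustive search evaluates every candidate configuration. The sketch below shows how a `param_grid` dictionary expands into individual parameter settings and how these could be crossed with `lags_grid`. It uses `itertools.product` as a stand-in for scikit-learn's `ParameterGrid`, and it assumes that each lags configuration is combined with every parameter setting; `expand_grid` is a hypothetical helper.

```python
from itertools import product


def expand_grid(param_grid, lags_grid):
    """Return every (lags, params) pair to be evaluated by backtesting."""
    keys = sorted(param_grid)
    # Cartesian product of all value lists, one dict per combination.
    combinations = [
        dict(zip(keys, values))
        for values in product(*(param_grid[key] for key in keys))
    ]
    return [(lags, params) for lags in lags_grid for params in combinations]


candidates = expand_grid(
    param_grid={'n_estimators': [50, 100], 'max_depth': [3, 5]},
    lags_grid=[3, [1, 2, 7]],
)
print(len(candidates))  # 2 lags configurations x 4 parameter settings
```

The number of backtesting runs grows multiplicatively with each added parameter or lags configuration, which is the usual reason to consider a random search instead.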

Parameters:

Name	Type	Description	Default
`forecaster`	`ForecasterAutoregMultiSeries, ForecasterAutoregMultiSeriesCustom, ForecasterAutoregMultiVariate`	Forecaster model.	required
`series`	`DataFrame`	Training time series.	required
`param_grid`	`dict`	Dictionary with parameters names (`str`) as keys and lists of parameter settings to try as values.	required
`steps`	`int`	Number of steps to predict.	required
`metric`	`Union[str, Callable, list]`	Metric used to quantify the goodness of fit of the model. If string: {'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error', 'mean_squared_log_error'} If Callable: Function with arguments y_true, y_pred that returns a float. If list: List containing multiple strings and/or Callables.	required
`initial_train_size`	`int`	Number of samples in the initial train split.	required
`fixed_train_size`	`bool`	If True, train size doesn't increase but moves by `steps` in each iteration.	`True`
`gap`	`int`	Number of samples to be excluded after the end of each training set and before the test set.	`0`
`allow_incomplete_fold`	`bool`	Last fold is allowed to have a smaller number of samples than the `test_size`. If `False`, the last fold is excluded.	`True`
`levels`	`Union[str, list]`	level (`str`) or levels (`list`) at which the forecaster is optimized. If `None`, all levels are taken into account. The resulting metric will be the average of the optimization of all levels.	`None`
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s. Must have the same number of observations as `series` and should be aligned so that series[i] is regressed on exog[i].	`None`
`lags_grid`	`Optional[list]`	Lists of `lags` to try. Only used if forecaster is an instance of `ForecasterAutoregMultiSeries` or `ForecasterAutoregMultiVariate`.	`None`
`refit`	`bool`	Whether to re-fit the forecaster in each iteration of backtesting.	`False`
`return_best`	`bool`	Refit the `forecaster` using the best found parameters on the whole data.	`True`
`verbose`	`bool`	Print number of folds used for cv or backtesting.	`True`
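The `levels` row says the reported metric is the average over the selected levels. A minimal sketch of that aggregation, assuming one metric value per level is already available; `average_metric_over_levels` and the level names are hypothetical.

```python
def average_metric_over_levels(metric_by_level, levels=None):
    """Average a per-level metric over the requested levels."""
    if levels is None:
        # None means every level is taken into account.
        levels = list(metric_by_level)
    elif isinstance(levels, str):
        # A single level may be given as a plain string.
        levels = [levels]
    return sum(metric_by_level[level] for level in levels) / len(levels)


metric_by_level = {'item_1': 2.0, 'item_2': 4.0}
overall = average_metric_over_levels(metric_by_level)
```

Because the average weights all levels equally, a series measured on a much larger scale can dominate scale-dependent metrics such as `mean_squared_error`.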

Returns:

Type	Description
`DataFrame`	Results for each combination of parameters. column levels = levels. column lags = lags configuration. column params = parameters configuration. column metric = metric(s) value(s) estimated for each combination of parameters. The resulting metric will be the average of the optimization of all levels. additional n columns with param = value.

Source code in skforecast/model_selection_multiseries/model_selection_multiseries.py

def grid_search_forecaster_multiseries(
    forecaster,
    series: pd.DataFrame,
    param_grid: dict,
    steps: int,
    metric: Union[str, Callable, list],
    initial_train_size: int,
    fixed_train_size: bool=True,
    gap: int=0,
    allow_incomplete_fold: bool=True,
    levels: Optional[Union[str, list]]=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    lags_grid: Optional[list]=None,
    refit: bool=False,
    return_best: bool=True,
    verbose: bool=True
) -> pd.DataFrame:
    """
    Exhaustive search over specified parameter values for a Forecaster object.
    Validation is done using multi-series backtesting.

    Parameters
    ----------
    forecaster : ForecasterAutoregMultiSeries, ForecasterAutoregMultiSeriesCustom, ForecasterAutoregMultiVariate
        Forecaster model.

    series : pandas DataFrame
        Training time series.

    param_grid : dict
        Dictionary with parameters names (`str`) as keys and lists of parameter
        settings to try as values.

    steps : int
        Number of steps to predict.

    metric : str, Callable, list
        Metric used to quantify the goodness of fit of the model.

        If string:
            {'mean_squared_error', 'mean_absolute_error',
             'mean_absolute_percentage_error', 'mean_squared_log_error'}

        If Callable:
            Function with arguments y_true, y_pred that returns a float.

        If list:
            List containing multiple strings and/or Callables.

    initial_train_size : int 
        Number of samples in the initial train split.

    fixed_train_size : bool, default `True`
        If True, train size doesn't increase but moves by `steps` in each iteration.

    gap : int, default `0`
        Number of samples to be excluded after the end of each training set and 
        before the test set.

    allow_incomplete_fold : bool, default `True`
        Last fold is allowed to have a smaller number of samples than the 
        `test_size`. If `False`, the last fold is excluded.

    levels : str, list, default `None`
        level (`str`) or levels (`list`) at which the forecaster is optimized. 
        If `None`, all levels are taken into account. The resulting metric will be
        the average of the optimization of all levels.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `series` and should be aligned so that
        series[i] is regressed on exog[i].

    lags_grid : list of int, lists, np.ndarray or range, default `None`
        Lists of `lags` to try. Only used if forecaster is an instance of 
        `ForecasterAutoregMultiSeries` or `ForecasterAutoregMultiVariate`.

    refit : bool, default `False`
        Whether to re-fit the forecaster in each iteration of backtesting.

    return_best : bool, default `True`
        Refit the `forecaster` using the best found parameters on the whole data.

    verbose : bool, default `True`
        Print number of folds used for cv or backtesting.

    Returns 
    -------
    results : pandas DataFrame
        Results for each combination of parameters.
            column levels = levels.
            column lags = lags configuration.
            column params = parameters configuration.
            column metric = metric(s) value(s) estimated for each combination of 
                            parameters. The resulting metric will be the average 
                            of the optimization of all levels.
            additional n columns with param = value.

    """

    param_grid = list(ParameterGrid(param_grid))

    results = _evaluate_grid_hyperparameters_multiseries(
                  forecaster            = forecaster,
                  series                = series,
                  param_grid            = param_grid,
                  steps                 = steps,
                  metric                = metric,
                  initial_train_size    = initial_train_size,
                  fixed_train_size      = fixed_train_size,
                  gap                   = gap,
                  allow_incomplete_fold = allow_incomplete_fold,
                  levels                = levels,
                  exog                  = exog,
                  lags_grid             = lags_grid,
                  refit                 = refit,
                  return_best           = return_best,
                  verbose               = verbose
              )

    return results

`random_search_forecaster_multiseries(forecaster, series, param_distributions, steps, metric, initial_train_size, fixed_train_size=True, gap=0, allow_incomplete_fold=True, levels=None, exog=None, lags_grid=None, refit=False, n_iter=10, random_state=123, return_best=True, verbose=True)`

Random search over specified parameter values or distributions for a Forecaster object.

Validation is done using multi-series backtesting.

Parameters:

Name	Type	Description	Default
`forecaster`	`ForecasterAutoregMultiSeries, ForecasterAutoregMultiSeriesCustom, ForecasterAutoregMultiVariate`	Forecaster model.	required
`series`	`DataFrame`	Training time series.	required
`param_distributions`	`dict`	Dictionary with parameters names (`str`) as keys and distributions or lists of parameters to try.	required
`steps`	`int`	Number of steps to predict.	required
`metric`	`Union[str, Callable, list]`	Metric used to quantify the goodness of fit of the model. If string: {'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error', 'mean_squared_log_error'} If Callable: Function with arguments y_true, y_pred that returns a float. If list: List containing multiple strings and/or Callables.	required
`initial_train_size`	`int`	Number of samples in the initial train split.	required
`fixed_train_size`	`bool`	If True, train size doesn't increase but moves by `steps` in each iteration.	`True`
`gap`	`int`	Number of samples to be excluded after the end of each training set and before the test set.	`0`
`allow_incomplete_fold`	`bool`	Last fold is allowed to have a smaller number of samples than the `test_size`. If `False`, the last fold is excluded.	`True`
`levels`	`Union[str, list]`	level (`str`) or levels (`list`) at which the forecaster is optimized. If `None`, all levels are taken into account. The resulting metric will be the average of the optimization of all levels.	`None`
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s. Must have the same number of observations as `series` and should be aligned so that series[i] is regressed on exog[i].	`None`
`lags_grid`	`Optional[list]`	Lists of `lags` to try. Only used if forecaster is an instance of `ForecasterAutoregMultiSeries` or `ForecasterAutoregMultiVariate`.	`None`
`refit`	`bool`	Whether to re-fit the forecaster in each iteration of backtesting.	`False`
`n_iter`	`int`	Number of parameter settings that are sampled per lags configuration. n_iter trades off runtime vs quality of the solution.	`10`
`random_state`	`int`	Sets a seed to the random sampling for reproducible output.	`123`
`return_best`	`bool`	Refit the `forecaster` using the best found parameters on the whole data.	`True`
`verbose`	`bool`	Print number of folds used for cv or backtesting.	`True`
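The effect of `n_iter` and `random_state` can be illustrated with a small sampler. This sketch only handles lists of candidate values and draws settings without replacement from the full grid; it stands in for scikit-learn's `ParameterSampler`, which also accepts distributions. `sample_param_settings` is a hypothetical helper.

```python
import random
from itertools import product


def sample_param_settings(param_distributions, n_iter=10, random_state=123):
    """Draw up to n_iter distinct parameter settings, reproducibly."""
    keys = sorted(param_distributions)
    grid = [
        dict(zip(keys, values))
        for values in product(*(param_distributions[key] for key in keys))
    ]
    rng = random.Random(random_state)  # same seed -> same sampled settings
    # Never request more settings than the grid contains.
    return rng.sample(grid, min(n_iter, len(grid)))


settings = sample_param_settings(
    {'alpha': [0.01, 0.1, 1, 10], 'fit_intercept': [True, False]},
    n_iter=3,
)
```

Here only 3 of the 8 possible settings are evaluated, which is the runtime versus quality trade-off that `n_iter` controls.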

Returns:

Type	Description
`DataFrame`	Results for each combination of parameters. column levels = levels. column lags = lags configuration. column params = parameters configuration. column metric = metric(s) value(s) estimated for each combination of parameters. The resulting metric will be the average of the optimization of all levels. additional n columns with param = value.

Source code in skforecast/model_selection_multiseries/model_selection_multiseries.py

def random_search_forecaster_multiseries(
    forecaster,
    series: pd.DataFrame,
    param_distributions: dict,
    steps: int,
    metric: Union[str, Callable, list],
    initial_train_size: int,
    fixed_train_size: bool=True,
    gap: int=0,
    allow_incomplete_fold: bool=True,
    levels: Optional[Union[str, list]]=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    lags_grid: Optional[list]=None,
    refit: bool=False,
    n_iter: int=10,
    random_state: int=123,
    return_best: bool=True,
    verbose: bool=True
) -> pd.DataFrame:
    """
    Random search over specified parameter values or distributions for a Forecaster 
    object. Validation is done using multi-series backtesting.

    Parameters
    ----------
    forecaster : ForecasterAutoregMultiSeries, ForecasterAutoregMultiSeriesCustom, ForecasterAutoregMultiVariate
        Forecaster model.

    series : pandas DataFrame
        Training time series.

    param_distributions : dict
        Dictionary with parameters names (`str`) as keys and 
        distributions or lists of parameters to try.

    steps : int
        Number of steps to predict.

    metric : str, Callable, list
        Metric used to quantify the goodness of fit of the model.

        If string:
            {'mean_squared_error', 'mean_absolute_error',
             'mean_absolute_percentage_error', 'mean_squared_log_error'}

        If Callable:
            Function with arguments y_true, y_pred that returns a float.

        If list:
            List containing multiple strings and/or Callables.

    initial_train_size : int 
        Number of samples in the initial train split.

    fixed_train_size : bool, default `True`
        If True, train size doesn't increase but moves by `steps` in each iteration.

    gap : int, default `0`
        Number of samples to be excluded after the end of each training set and 
        before the test set.

    allow_incomplete_fold : bool, default `True`
        Last fold is allowed to have a smaller number of samples than the 
        `test_size`. If `False`, the last fold is excluded.

    levels : str, list, default `None`
        level (`str`) or levels (`list`) at which the forecaster is optimized. 
        If `None`, all levels are taken into account. The resulting metric will be
        the average of the optimization of all levels.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `series` and should be aligned so that
        series[i] is regressed on exog[i].

    lags_grid : list of int, lists, np.ndarray or range, default `None`
        Lists of `lags` to try. Only used if forecaster is an instance of 
        `ForecasterAutoregMultiSeries` or `ForecasterAutoregMultiVariate`.

    refit : bool, default `False`
        Whether to re-fit the forecaster in each iteration of backtesting.

    n_iter : int, default `10`
        Number of parameter settings that are sampled per lags configuration. 
        n_iter trades off runtime vs quality of the solution.

    random_state : int, default `123`
        Sets a seed to the random sampling for reproducible output.

    return_best : bool, default `True`
        Refit the `forecaster` using the best found parameters on the whole data.

    verbose : bool, default `True`
        Print number of folds used for cv or backtesting.

    Returns 
    -------
    results : pandas DataFrame
        Results for each combination of parameters.
            column levels = levels.
            column lags = lags configuration.
            column params = parameters configuration.
            column metric = metric(s) value(s) estimated for each combination of 
                            parameters. The resulting metric will be the average 
                            of the optimization of all levels.
            additional n columns with param = value.

    """

    param_grid = list(ParameterSampler(param_distributions, n_iter=n_iter, 
                                       random_state=random_state))

    results = _evaluate_grid_hyperparameters_multiseries(
                  forecaster            = forecaster,
                  series                = series,
                  param_grid            = param_grid,
                  steps                 = steps,
                  metric                = metric,
                  initial_train_size    = initial_train_size,
                  fixed_train_size      = fixed_train_size,
                  gap                   = gap,
                  allow_incomplete_fold = allow_incomplete_fold,
                  levels                = levels,
                  exog                  = exog,
                  lags_grid             = lags_grid,
                  refit                 = refit,
                  return_best           = return_best,
                  verbose               = verbose
              )

    return results

`backtesting_forecaster_multivariate(forecaster, series, steps, metric, initial_train_size, fixed_train_size=True, gap=0, allow_incomplete_fold=True, levels=None, exog=None, refit=False, interval=None, n_boot=500, random_state=123, in_sample_residuals=True, verbose=False, show_progress=True)`

This function is an alias of backtesting_forecaster_multiseries.

Backtesting for multi-series and multivariate forecasters.

If `refit` is `False`, the model is trained only once, using the first `initial_train_size` observations. If `refit` is `True`, the model is retrained in each iteration; the training set grows with each iteration unless `fixed_train_size` is `True`, in which case it keeps its size and moves forward by `steps`. A copy of the original forecaster is created so it is not modified during the process.

Parameters:

Name	Type	Description	Default
`forecaster`	`ForecasterAutoregMultiSeries, ForecasterAutoregMultiSeriesCustom, ForecasterAutoregMultiVariate`	Forecaster model.	required
`series`	`DataFrame`	Training time series.	required
`steps`	`int`	Number of steps to predict.	required
`metric`	`Union[str, Callable, list]`	Metric used to quantify the goodness of fit of the model. If string: {'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error', 'mean_squared_log_error'} If Callable: Function with arguments y_true, y_pred that returns a float. If list: List containing multiple strings and/or Callables.	required
`initial_train_size`	`Optional[int]`	Number of samples in the initial train split. If `None` and `forecaster` is already trained, no initial train is done and all data is used to evaluate the model. However, the first `len(forecaster.last_window)` observations are needed to create the initial predictors, so no predictions are calculated for them. `None` is only allowed when `refit` is `False`.	required
`fixed_train_size`	`bool`	If True, train size doesn't increase but moves by `steps` in each iteration.	`True`
`gap`	`int`	Number of samples to be excluded after the end of each training set and before the test set.	`0`
`allow_incomplete_fold`	`bool`	Last fold is allowed to have a smaller number of samples than the `test_size`. If `False`, the last fold is excluded.	`True`
`levels`	`Union[str, list]`	Time series to be predicted. If `None` all levels will be predicted. New in version 0.6.0	`None`
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s. Must have the same number of observations as `series` and should be aligned so that series[i] is regressed on exog[i].	`None`
`refit`	`bool`	Whether to re-fit the forecaster in each iteration.	`False`
`interval`	`Optional[list]`	Confidence of the prediction interval estimated. Sequence of percentiles to compute, which must be between 0 and 100 inclusive. If `None`, no intervals are estimated. Only available for forecaster of type ForecasterAutoreg and ForecasterAutoregCustom.	`None`
`n_boot`	`int`	Number of bootstrapping iterations used to estimate prediction intervals.	`500`
`random_state`	`int`	Sets a seed to the random generator, so that boot intervals are always deterministic.	`123`
`in_sample_residuals`	`bool`	If `True`, residuals from the training data are used as proxy of prediction error to create prediction intervals. If `False`, out_sample_residuals are used if they are already stored inside the forecaster.	`True`
`verbose`	`bool`	Print number of folds and index of training and validation sets used for backtesting.	`False`
`show_progress`	`bool`	Whether to show a progress bar. Defaults to True.	`True`

Returns:

Type	Description
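`Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]`	`metrics_levels`: value(s) of the metric(s); the index holds the levels and the columns the metrics. `backtest_predictions`: predictions and, if `interval` is not `None`, their estimated interval (columns `pred`, `lower_bound`, `upper_bound`), repeated for each level.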

Source code in skforecast/model_selection_multiseries/model_selection_multiseries.py

def backtesting_forecaster_multivariate(
    forecaster,
    series: pd.DataFrame,
    steps: int,
    metric: Union[str, Callable, list],
    initial_train_size: Optional[int],
    fixed_train_size: bool=True,
    gap: int=0,
    allow_incomplete_fold: bool=True,
    levels: Optional[Union[str, list]]=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    refit: bool=False,
    interval: Optional[list]=None,
    n_boot: int=500,
    random_state: int=123,
    in_sample_residuals: bool=True,
    verbose: bool=False,
    show_progress: bool=True
) -> Tuple[pd.DataFrame, pd.DataFrame]:
    """
    This function is an alias of backtesting_forecaster_multiseries.

    Backtesting for multi-series and multivariate forecasters.

    If `refit` is False, the model is trained only once using the `initial_train_size`
    first observations. If `refit` is True, the model is trained in each iteration
    increasing the training set. A copy of the original forecaster is created so 
    it is not modified during the process.

    Parameters
    ----------
    forecaster : ForecasterAutoregMultiSeries, ForecasterAutoregMultiSeriesCustom, ForecasterAutoregMultiVariate
        Forecaster model.

    series : pandas DataFrame
        Training time series.

    steps : int
        Number of steps to predict.

    metric : str, Callable, list
        Metric used to quantify the goodness of fit of the model.

        If string:
            {'mean_squared_error', 'mean_absolute_error',
             'mean_absolute_percentage_error', 'mean_squared_log_error'}

        If Callable:
            Function with arguments y_true, y_pred that returns a float.

        If list:
            List containing multiple strings and/or Callables.

    initial_train_size : int, None
        Number of samples in the initial train split. If `None` and `forecaster` is already 
        trained, no initial train is done and all data is used to evaluate the model. However, 
        the first `len(forecaster.last_window)` observations are needed to create the 
        initial predictors, so no predictions are calculated for them.

        `None` is only allowed when `refit` is `False`.

    fixed_train_size : bool, default `True`
        If True, train size doesn't increase but moves by `steps` in each iteration.

    gap : int, default `0`
        Number of samples to be excluded after the end of each training set and 
        before the test set.

    allow_incomplete_fold : bool, default `True`
        Last fold is allowed to have a smaller number of samples than the 
        `test_size`. If `False`, the last fold is excluded.

    levels : str, list, default `None`
        Time series to be predicted. If `None` all levels will be predicted.
        **New in version 0.6.0**

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `series` and should be aligned so that
        series[i] is regressed on exog[i].

    refit : bool, default `False`
        Whether to re-fit the forecaster in each iteration.

    interval : list, default `None`
        Confidence of the prediction interval estimated. Sequence of percentiles
        to compute, which must be between 0 and 100 inclusive. If `None`, no
        intervals are estimated. Only available for forecaster of type ForecasterAutoreg
        and ForecasterAutoregCustom.

    n_boot : int, default `500`
        Number of bootstrapping iterations used to estimate prediction
        intervals.

    random_state : int, default `123`
        Sets a seed to the random generator, so that boot intervals are always 
        deterministic.

    in_sample_residuals : bool, default `True`
        If `True`, residuals from the training data are used as proxy of
        prediction error to create prediction intervals.  If `False`, out_sample_residuals
        are used if they are already stored inside the forecaster.

    verbose : bool, default `False`
        Print number of folds and index of training and validation sets used for backtesting.

    show_progress : bool, default `True`
        Whether to show a progress bar.

    Returns 
    -------
    metrics_levels : pandas DataFrame
        Value(s) of the metric(s). Index are the levels and columns the metrics.

    backtest_predictions : pandas DataFrame
        Value of predictions and their estimated interval if `interval` is not `None`.
        If there is more than one level, this structure will be repeated for each of them.
            column pred = predictions.
            column lower_bound = lower bound of the interval.
            column upper_bound = upper bound of the interval.

    """

    metrics_levels, backtest_predictions = backtesting_forecaster_multiseries(
        forecaster            = forecaster,
        series                = series,
        steps                 = steps,
        metric                = metric,
        initial_train_size    = initial_train_size,
        fixed_train_size      = fixed_train_size,
        gap                   = gap,
        allow_incomplete_fold = allow_incomplete_fold,
        levels                = levels,
        exog                  = exog,
        refit                 = refit,
        interval              = interval,
        n_boot                = n_boot,
        random_state          = random_state,
        in_sample_residuals   = in_sample_residuals,
        verbose               = verbose,
        show_progress         = show_progress
    )

    return metrics_levels, backtest_predictions
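
As a sketch of the `metric` argument described in the docstring above: a callable only needs to accept `y_true` and `y_pred` and return a float. The function `max_absolute_error` below is a hypothetical example, not part of the library, and plain lists stand in for the values the backtesting routine would pass.

```python
def max_absolute_error(y_true, y_pred):
    """Largest absolute deviation between observed and predicted values."""
    return float(max(abs(t - p) for t, p in zip(y_true, y_pred)))


# Strings and callables can be mixed in a list, as the docstring allows.
metric = ["mean_absolute_error", max_absolute_error]

# Quick check of the callable on toy values.
worst = max_absolute_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
print(worst)
```

A list such as `metric` above would be passed as the `metric` argument, giving one column per entry in the returned metrics frame.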

`grid_search_forecaster_multivariate(forecaster, series, param_grid, steps, metric, initial_train_size, fixed_train_size=True, gap=0, allow_incomplete_fold=True, levels=None, exog=None, lags_grid=None, refit=False, return_best=True, verbose=True)` ¶

This function is an alias of grid_search_forecaster_multiseries.

Exhaustive search over specified parameter values for a Forecaster object. Validation is done using multi-series backtesting.

Parameters:

Name	Type	Description	Default
`forecaster`	`ForecasterAutoregMultiSeries, ForecasterAutoregMultiSeriesCustom, ForecasterAutoregMultiVariate`	Forecaster model.	required
`series`	`DataFrame`	Training time series.	required
`param_grid`	`dict`	Dictionary with parameters names (`str`) as keys and lists of parameter settings to try as values.	required
`steps`	`int`	Number of steps to predict.	required
`metric`	`Union[str, Callable, list]`	Metric used to quantify the goodness of fit of the model. If string: {'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error', 'mean_squared_log_error'} If Callable: Function with arguments y_true, y_pred that returns a float. If list: List containing multiple strings and/or Callables.	required
`initial_train_size`	`int`	Number of samples in the initial train split.	required
`fixed_train_size`	`bool`	If True, train size doesn't increase but moves by `steps` in each iteration.	`True`
`gap`	`int`	Number of samples to be excluded after the end of each training set and before the test set.	`0`
`allow_incomplete_fold`	`bool`	Last fold is allowed to have a smaller number of samples than the `test_size`. If `False`, the last fold is excluded.	`True`
`levels`	`Union[str, list]`	level (`str`) or levels (`list`) at which the forecaster is optimized. If `None`, all levels are taken into account. The resulting metric will be the average of the optimization of all levels.	`None`
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s. Must have the same number of observations as `series` and should be aligned so that series[i] is regressed on exog[i].	`None`
`lags_grid`	`Optional[list]`	Lists of `lags` to try. Only used if forecaster is an instance of `ForecasterAutoregMultiSeries` or `ForecasterAutoregMultiVariate`.	`None`
`refit`	`bool`	Whether to re-fit the forecaster in each iteration of backtesting.	`False`
`return_best`	`bool`	Refit the `forecaster` using the best found parameters on the whole data.	`True`
`verbose`	`bool`	Print number of folds used for cv or backtesting.	`True`

Returns:

Type	Description
`DataFrame`	Results for each combination of parameters. column levels = levels. column lags = lags configuration tried. column params = parameter configuration tried. column metric = metric(s) value(s) estimated for each combination of parameters. The resulting metric will be the average of the optimization of all levels. additional n columns with param = value.
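
To make the size of the search concrete, the sketch below enumerates the candidates an exhaustive search would evaluate: every entry of `lags_grid` paired with every combination from `param_grid`. `expand_grid` is a hypothetical helper built on `itertools.product`, not the library's code, and the parameter names `alpha` and `fit_intercept` are assumptions chosen for illustration.

```python
from itertools import product


def expand_grid(param_grid, lags_grid):
    """Pair each lags configuration with each parameter combination."""
    names = list(param_grid)
    combos = [
        dict(zip(names, values))
        for values in product(*(param_grid[name] for name in names))
    ]
    return [(lags, params) for lags in lags_grid for params in combos]


# Illustrative grids (assumed names and values).
param_grid = {"alpha": [0.1, 1.0], "fit_intercept": [True, False]}
lags_grid = [3, [1, 2, 5]]

candidates = expand_grid(param_grid, lags_grid)
print(len(candidates))  # 2 lags configurations x 4 parameter combinations
```

Each candidate is backtested once, so runtime grows with `len(lags_grid)` times the product of the list lengths in `param_grid`.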

Source code in skforecast/model_selection_multiseries/model_selection_multiseries.py

def grid_search_forecaster_multivariate(
    forecaster,
    series: pd.DataFrame,
    param_grid: dict,
    steps: int,
    metric: Union[str, Callable, list],
    initial_train_size: int,
    fixed_train_size: bool=True,
    gap: int=0,
    allow_incomplete_fold: bool=True,
    levels: Optional[Union[str, list]]=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    lags_grid: Optional[list]=None,
    refit: bool=False,
    return_best: bool=True,
    verbose: bool=True
) -> pd.DataFrame:
    """
    This function is an alias of grid_search_forecaster_multiseries.

    Exhaustive search over specified parameter values for a Forecaster object.
    Validation is done using multi-series backtesting.

    Parameters
    ----------
    forecaster : ForecasterAutoregMultiSeries, ForecasterAutoregMultiSeriesCustom, ForecasterAutoregMultiVariate
        Forecaster model.

    series : pandas DataFrame
        Training time series.

    param_grid : dict
        Dictionary with parameters names (`str`) as keys and lists of parameter
        settings to try as values.

    steps : int
        Number of steps to predict.

    metric : str, Callable, list
        Metric used to quantify the goodness of fit of the model.

        If string:
            {'mean_squared_error', 'mean_absolute_error',
             'mean_absolute_percentage_error', 'mean_squared_log_error'}

        If Callable:
            Function with arguments y_true, y_pred that returns a float.

        If list:
            List containing multiple strings and/or Callables.

    initial_train_size : int 
        Number of samples in the initial train split.

    fixed_train_size : bool, default `True`
        If True, train size doesn't increase but moves by `steps` in each iteration.

    gap : int, default `0`
        Number of samples to be excluded after the end of each training set and 
        before the test set.

    allow_incomplete_fold : bool, default `True`
        Last fold is allowed to have a smaller number of samples than the 
        `test_size`. If `False`, the last fold is excluded.

    levels : str, list, default `None`
        level (`str`) or levels (`list`) at which the forecaster is optimized. 
        If `None`, all levels are taken into account. The resulting metric will be
        the average of the optimization of all levels.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `series` and should be aligned so that
        series[i] is regressed on exog[i].

    lags_grid : list of int, lists, np.ndarray or range, default `None`
        Lists of `lags` to try. Only used if forecaster is an instance of 
        `ForecasterAutoregMultiSeries` or `ForecasterAutoregMultiVariate`.

    refit : bool, default `False`
        Whether to re-fit the forecaster in each iteration of backtesting.

    return_best : bool, default `True`
        Refit the `forecaster` using the best found parameters on the whole data.

    verbose : bool, default `True`
        Print number of folds used for cv or backtesting.

    Returns 
    -------
    results : pandas DataFrame
        Results for each combination of parameters.
            column levels = levels.
            column lags = lags configuration tried.
            column params = parameter configuration tried.
            column metric = metric(s) value(s) estimated for each combination of 
                            parameters. The resulting metric will be the average 
                            of the optimization of all levels.
            additional n columns with param = value.

    """

    results = grid_search_forecaster_multiseries(
        forecaster            = forecaster,
        series                = series,
        param_grid            = param_grid,
        steps                 = steps,
        metric                = metric,
        initial_train_size    = initial_train_size,
        fixed_train_size      = fixed_train_size,
        gap                   = gap,
        allow_incomplete_fold = allow_incomplete_fold,
        levels                = levels,
        exog                  = exog,
        lags_grid             = lags_grid,
        refit                 = refit,
        return_best           = return_best,
        verbose               = verbose
    )

    return results

`random_search_forecaster_multivariate(forecaster, series, param_distributions, steps, metric, initial_train_size, fixed_train_size=True, gap=0, allow_incomplete_fold=True, levels=None, exog=None, lags_grid=None, refit=False, n_iter=10, random_state=123, return_best=True, verbose=True)` ¶

This function is an alias of random_search_forecaster_multiseries.

Random search over specified parameter values or distributions for a Forecaster object. Validation is done using multi-series backtesting.

Parameters:

Name	Type	Description	Default
`forecaster`	`ForecasterAutoregMultiSeries, ForecasterAutoregMultiSeriesCustom, ForecasterAutoregMultiVariate`	Forecaster model.	required
`series`	`DataFrame`	Training time series.	required
`param_distributions`	`dict`	Dictionary with parameters names (`str`) as keys and distributions or lists of parameters to try.	required
`steps`	`int`	Number of steps to predict.	required
`metric`	`Union[str, Callable, list]`	Metric used to quantify the goodness of fit of the model. If string: {'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error', 'mean_squared_log_error'} If Callable: Function with arguments y_true, y_pred that returns a float. If list: List containing multiple strings and/or Callables.	required
`initial_train_size`	`int`	Number of samples in the initial train split.	required
`fixed_train_size`	`bool`	If True, train size doesn't increase but moves by `steps` in each iteration.	`True`
`gap`	`int`	Number of samples to be excluded after the end of each training set and before the test set.	`0`
`allow_incomplete_fold`	`bool`	Last fold is allowed to have a smaller number of samples than the `test_size`. If `False`, the last fold is excluded.	`True`
`levels`	`Union[str, list]`	level (`str`) or levels (`list`) at which the forecaster is optimized. If `None`, all levels are taken into account. The resulting metric will be the average of the optimization of all levels.	`None`
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s. Must have the same number of observations as `series` and should be aligned so that series[i] is regressed on exog[i].	`None`
`lags_grid`	`Optional[list]`	Lists of `lags` to try. Only used if forecaster is an instance of `ForecasterAutoregMultiSeries` or `ForecasterAutoregMultiVariate`.	`None`
`refit`	`bool`	Whether to re-fit the forecaster in each iteration of backtesting.	`False`
`n_iter`	`int`	Number of parameter settings that are sampled per lags configuration. n_iter trades off runtime vs quality of the solution.	`10`
`random_state`	`int`	Sets a seed to the random sampling for reproducible output.	`123`
`return_best`	`bool`	Refit the `forecaster` using the best found parameters on the whole data.	`True`
`verbose`	`bool`	Print number of folds used for cv or backtesting.	`True`

Returns:

Type	Description
`DataFrame`	Results for each combination of parameters. column levels = levels. column lags = lags configuration tried. column params = parameter configuration tried. column metric = metric(s) value(s) estimated for each combination of parameters. The resulting metric will be the average of the optimization of all levels. additional n columns with param = value.
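
The role of `n_iter` and `random_state` can be sketched with the standard library. `sample_settings` below is a hypothetical helper, not the library's sampler: it assumes every entry of `param_distributions` is a list and draws with replacement, so it does not cover distribution objects. The parameter names are illustrative.

```python
import random


def sample_settings(param_distributions, n_iter=10, random_state=123):
    """Draw `n_iter` parameter settings from lists of candidate values."""
    rng = random.Random(random_state)  # seeded, so draws are reproducible
    return [
        {name: rng.choice(values) for name, values in param_distributions.items()}
        for _ in range(n_iter)
    ]


# Illustrative search space (assumed names and values).
param_distributions = {"n_estimators": [50, 100, 200], "max_depth": [3, 5, 10]}
lags_grid = [3, 7]

sampled = sample_settings(param_distributions, n_iter=4)
# Each lags configuration is evaluated with the sampled settings.
candidates = [(lags, params) for lags in lags_grid for params in sampled]
print(len(candidates))  # 2 lags configurations x 4 sampled settings
```

Unlike an exhaustive grid, the number of backtests here is fixed by `n_iter` regardless of how many values each parameter lists.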

Source code in skforecast/model_selection_multiseries/model_selection_multiseries.py

def random_search_forecaster_multivariate(
    forecaster,
    series: pd.DataFrame,
    param_distributions: dict,
    steps: int,
    metric: Union[str, Callable, list],
    initial_train_size: int,
    fixed_train_size: bool=True,
    gap: int=0,
    allow_incomplete_fold: bool=True,
    levels: Optional[Union[str, list]]=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    lags_grid: Optional[list]=None,
    refit: bool=False,
    n_iter: int=10,
    random_state: int=123,
    return_best: bool=True,
    verbose: bool=True
) -> pd.DataFrame:
    """
    This function is an alias of random_search_forecaster_multiseries.

    Random search over specified parameter values or distributions for a Forecaster 
    object. Validation is done using multi-series backtesting.

    Parameters
    ----------
    forecaster : ForecasterAutoregMultiSeries, ForecasterAutoregMultiSeriesCustom, ForecasterAutoregMultiVariate
        Forecaster model.

    series : pandas DataFrame
        Training time series.

    param_distributions : dict
        Dictionary with parameters names (`str`) as keys and 
        distributions or lists of parameters to try.

    steps : int
        Number of steps to predict.

    metric : str, Callable, list
        Metric used to quantify the goodness of fit of the model.

        If string:
            {'mean_squared_error', 'mean_absolute_error',
             'mean_absolute_percentage_error', 'mean_squared_log_error'}

        If Callable:
            Function with arguments y_true, y_pred that returns a float.

        If list:
            List containing multiple strings and/or Callables.

    initial_train_size : int 
        Number of samples in the initial train split.

    fixed_train_size : bool, default `True`
        If True, train size doesn't increase but moves by `steps` in each iteration.

    gap : int, default `0`
        Number of samples to be excluded after the end of each training set and 
        before the test set.

    allow_incomplete_fold : bool, default `True`
        Last fold is allowed to have a smaller number of samples than the 
        `test_size`. If `False`, the last fold is excluded.

    levels : str, list, default `None`
        level (`str`) or levels (`list`) at which the forecaster is optimized. 
        If `None`, all levels are taken into account. The resulting metric will be
        the average of the optimization of all levels.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `series` and should be aligned so that
        series[i] is regressed on exog[i].

    lags_grid : list of int, lists, np.ndarray or range, default `None`
        Lists of `lags` to try. Only used if forecaster is an instance of 
        `ForecasterAutoregMultiSeries` or `ForecasterAutoregMultiVariate`.

    refit : bool, default `False`
        Whether to re-fit the forecaster in each iteration of backtesting.

    n_iter : int, default `10`
        Number of parameter settings that are sampled per lags configuration. 
        n_iter trades off runtime vs quality of the solution.

    random_state : int, default `123`
        Sets a seed to the random sampling for reproducible output.

    return_best : bool, default `True`
        Refit the `forecaster` using the best found parameters on the whole data.

    verbose : bool, default `True`
        Print number of folds used for cv or backtesting.

    Returns 
    -------
    results : pandas DataFrame
        Results for each combination of parameters.
            column levels = levels.
            column lags = lags configuration tried.
            column params = parameter configuration tried.
            column metric = metric(s) value(s) estimated for each combination of 
                            parameters. The resulting metric will be the average 
                            of the optimization of all levels.
            additional n columns with param = value.

    """

    results = random_search_forecaster_multiseries(
        forecaster            = forecaster,
        series                = series,
        param_distributions   = param_distributions,
        steps                 = steps,
        metric                = metric,
        initial_train_size    = initial_train_size,
        fixed_train_size      = fixed_train_size,
        gap                   = gap,
        allow_incomplete_fold = allow_incomplete_fold,
        levels                = levels,
        exog                  = exog,
        lags_grid             = lags_grid,
        refit                 = refit,
        n_iter                = n_iter,
        random_state          = random_state,
        return_best           = return_best,
        verbose               = verbose
    )

    return results
