`ForecasterAutoregMultiSeries`¶

`ForecasterAutoregMultiSeries (ForecasterBase)` ¶

This class turns any regressor compatible with the scikit-learn API into a

recursive autoregressive (multi-step) forecaster for multiple series.

Parameters:

Name	Type	Description	Default
`regressor`	`object`	An instance of a regressor or pipeline compatible with the scikit-learn API.	required
`lags`	`Union[int, numpy.ndarray, list]`	Lags used as predictors. Index starts at 1, so lag 1 is equal to t-1. `int`: include lags from 1 to `lags` (included). `list`, `numpy ndarray` or `range`: include only lags present in `lags`, all elements must be int.	required
`transformer_series`	`Union[object, dict]`	An instance of a transformer (preprocessor) compatible with the scikit-learn preprocessing API with methods: fit, transform, fit_transform and inverse_transform. If a single transformer is passed, it is cloned and applied to all series. If a dict, a different transformer can be used for each series. Transformation is applied to each `series` before training the forecaster. ColumnTransformers are not allowed since they do not have inverse_transform method.	`None`
`transformer_exog`	`Optional[object]`	An instance of a transformer (preprocessor) compatible with the scikit-learn preprocessing API. The transformation is applied to `exog` before training the forecaster. `inverse_transform` is not available when using ColumnTransformers.	`None`
`weight_func`	`Union[Callable, dict]`	Function that defines the individual weights for each sample based on the index. For example, a function that assigns a lower weight to certain dates. If dict {'series_column_name' : Callable} a different function can be used for each series, a weight of 1 is given to all series not present in `weight_func`. Ignored if `regressor` does not have the argument `sample_weight` in its `fit` method. See Notes section for more details on the use of the weights. New in version 0.6.0	`None`
`series_weights`	`Optional[dict]`	Weights associated with each series {'series_column_name' : float}. It is only applied if the `regressor` used accepts `sample_weight` in its `fit` method. If `series_weights` is provided, a weight of 1 is given to all series not present in `series_weights`. If `None`, all levels have the same weight. See Notes section for more details on the use of the weights. New in version 0.6.0	`None`
`fit_kwargs`	`Optional[dict]`	Additional arguments to be passed to the `fit` method of the regressor. New in version 0.8.0	`None`
`forecaster_id`	`Union[str, int]`	Name used as an identifier of the forecaster. New in version 0.7.0	`None`

Attributes:

Name	Type	Description
`regressor`	`regressor or pipeline compatible with the scikit-learn API`	An instance of a regressor or pipeline compatible with the scikit-learn API.
`lags`	`numpy ndarray`	Lags used as predictors.
`transformer_series`	transformer (preprocessor) or dict of transformers, default `None`	An instance of a transformer (preprocessor) compatible with the scikit-learn preprocessing API with methods: fit, transform, fit_transform and inverse_transform. If a single transformer is passed, it is cloned and applied to all series. If a dict, a different transformer can be used for each series. Transformation is applied to each `series` before training the forecaster. ColumnTransformers are not allowed since they do not have inverse_transform method.
`transformer_series_`	`dict`	Dictionary with the transformer for each series. It is created cloning the objects in `transformer_series` and is used internally to avoid overwriting.
`transformer_exog`	transformer (preprocessor), default `None`	An instance of a transformer (preprocessor) compatible with the scikit-learn preprocessing API. The transformation is applied to `exog` before training the forecaster. `inverse_transform` is not available when using ColumnTransformers.
`weight_func`	Callable, dict, default `None`	Function that defines the individual weights of each sample based on the index. For example, a function that assigns a lower weight to certain dates. If dict {'series_column_name': Callable} a different function can be used for each series, a weight of 1 is given to all series not present in `weight_func`. Ignored if `regressor` does not have the argument `sample_weight` in its `fit` method. See Notes section for more details on the use of the weights. New in version 0.6.0
`weight_func_`	`dict`	Dictionary with the `weight_func` for each series. It is created cloning the objects in `weight_func` and is used internally to avoid overwriting. New in version 0.6.0
`source_code_weight_func`	`str, dict`	Source code of the custom function(s) used to create weights. New in version 0.6.0
`series_weights`	dict, default `None`	Weights associated with each series {'series_column_name': float}. It is only applied if the `regressor` used accepts `sample_weight` in its `fit` method. If `series_weights` is provided, a weight of 1 is given to all series not present in `series_weights`. If `None`, all levels have the same weight. See Notes section for more details on the use of the weights. New in version 0.6.0
`series_weights_`	`dict`	Weights associated with each series.It is created as a clone of `series_weights` and is used internally to avoid overwriting. New in version 0.6.0
`max_lag`	`int`	Maximum value of lag included in `lags`.
`window_size`	`int`	Size of the window needed to create the predictors. It is equal to `max_lag`.
`last_window`	`pandas Series`	Last window seen by the forecaster during training. It stores the values needed to predict the next `step` immediately after the training data.
`index_type`	`type`	Type of index of the input used in training.
`index_freq`	`str`	Frequency of Index of the input used in training.
`index_values`	`pandas Index`	Values of Index of the input used in training.
`training_range`	`pandas Index`	First and last values of index of the data used during training.
`included_exog`	`bool`	If the forecaster has been trained using exogenous variable/s.
`exog_type`	`type`	Type of exogenous variable/s used in training.
`exog_dtypes`	`dict`	Type of each exogenous variable/s used in training. If `transformer_exog` is used, the dtypes are calculated after the transformation.
`exog_col_names`	`list`	Names of columns of `exog` if `exog` used in training was a pandas DataFrame.
`series_col_names`	`list`	Names of the series (levels) used during training.
`X_train_col_names`	`list`	Names of columns of the matrix created internally for training.
`fit_kwargs`	`dict`	Additional arguments to be passed to the `fit` method of the regressor. New in version 0.8.0
`in_sample_residuals`	`dict`	Residuals of the model when predicting training data. Only stored up to 1000 values in the form `{level: residuals}`. If `transformer_series` is not `None`, residuals are stored in the transformed scale.
`out_sample_residuals`	`dict`	Residuals of the models when predicting non training data. Only stored up to 1000 values in the form `{level: residuals}`. If `transformer_series` is not `None`, residuals are assumed to be in the transformed scale. Use `set_out_sample_residuals()` method to set values.
`fitted`	`bool`	Tag to identify if the regressor has been fitted (trained).
`creation_date`	`str`	Date of creation.
`fit_date`	`str`	Date of last fit.
`skforcast_version`	`str`	Version of skforecast library used to create the forecaster.
`python_version`	`str`	Version of python used to create the forecaster.
`forecaster_id`	str, int default `None`	Name used as an identifier of the forecaster. New in version 0.7.0

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

class ForecasterAutoregMultiSeries(ForecasterBase):
    """
    This class turns any regressor compatible with the scikit-learn API into a
    recursive autoregressive (multi-step) forecaster for multiple series.

    Parameters
    ----------
    regressor : regressor or pipeline compatible with the scikit-learn API
        An instance of a regressor or pipeline compatible with the scikit-learn API.

    lags : int, list, numpy ndarray, range
        Lags used as predictors. Index starts at 1, so lag 1 is equal to t-1.
            `int`: include lags from 1 to `lags` (included).
            `list`, `numpy ndarray` or `range`: include only lags present in `lags`,
            all elements must be int.

    transformer_series : transformer (preprocessor) or dict of transformers, default `None`
        An instance of a transformer (preprocessor) compatible with the scikit-learn
        preprocessing API with methods: fit, transform, fit_transform and inverse_transform.
        If a single transformer is passed, it is cloned and applied to all series. If a
        dict, a different transformer can be used for each series. Transformation is
        applied to each `series` before training the forecaster.
        ColumnTransformers are not allowed since they do not have inverse_transform method.

    transformer_exog : transformer, default `None`
        An instance of a transformer (preprocessor) compatible with the scikit-learn
        preprocessing API. The transformation is applied to `exog` before training the
        forecaster. `inverse_transform` is not available when using ColumnTransformers.

    weight_func : Callable, dict, default `None`
        Function that defines the individual weights for each sample based on the
        index. For example, a function that assigns a lower weight to certain dates.
        If dict {'series_column_name' : Callable} a different function can be
        used for each series, a weight of 1 is given to all series not present
        in `weight_func`. Ignored if `regressor` does not have the argument 
        `sample_weight` in its `fit` method. See Notes section for more details 
        on the use of the weights.
        **New in version 0.6.0**

    series_weights : dict, default `None`
        Weights associated with each series {'series_column_name' : float}. It is only
        applied if the `regressor` used accepts `sample_weight` in its `fit` method.
        If `series_weights` is provided, a weight of 1 is given to all series not present
        in `series_weights`. If `None`, all levels have the same weight. See Notes section
        for more details on the use of the weights.
        **New in version 0.6.0**

    fit_kwargs : dict, default `None`
        Additional arguments to be passed to the `fit` method of the regressor.
        **New in version 0.8.0**

    forecaster_id : str, int, default `None`
        Name used as an identifier of the forecaster.
        **New in version 0.7.0**


    Attributes
    ----------
    regressor : regressor or pipeline compatible with the scikit-learn API
        An instance of a regressor or pipeline compatible with the scikit-learn API.

    lags : numpy ndarray
        Lags used as predictors.

    transformer_series : transformer (preprocessor) or dict of transformers, default `None`
        An instance of a transformer (preprocessor) compatible with the scikit-learn
        preprocessing API with methods: fit, transform, fit_transform and inverse_transform.
        If a single transformer is passed, it is cloned and applied to all series. If a
        dict, a different transformer can be used for each series. Transformation is
        applied to each `series` before training the forecaster.
        ColumnTransformers are not allowed since they do not have inverse_transform method.

    transformer_series_ : dict
        Dictionary with the transformer for each series. It is created cloning the objects
        in `transformer_series` and is used internally to avoid overwriting.

    transformer_exog : transformer (preprocessor), default `None`
        An instance of a transformer (preprocessor) compatible with the scikit-learn
        preprocessing API. The transformation is applied to `exog` before training the
        forecaster. `inverse_transform` is not available when using ColumnTransformers.

    weight_func : Callable, dict, default `None`
        Function that defines the individual weights of each sample based on the
        index. For example, a function that assigns a lower weight to certain dates.
        If dict {'series_column_name': Callable} a different function can be
        used for each series, a weight of 1 is given to all series not present
        in `weight_func`. Ignored if `regressor` does not have the argument 
        `sample_weight` in its `fit` method. See Notes section for more details 
        on the use of the weights.
        **New in version 0.6.0**

    weight_func_ : dict
        Dictionary with the `weight_func` for each series. It is created cloning the objects
        in `weight_func` and is used internally to avoid overwriting.
        **New in version 0.6.0**

    source_code_weight_func : str, dict
        Source code of the custom function(s) used to create weights.
        **New in version 0.6.0**

    series_weights : dict, default `None`
        Weights associated with each series {'series_column_name': float}. It is only
        applied if the `regressor` used accepts `sample_weight` in its `fit` method.
        If `series_weights` is provided, a weight of 1 is given to all series not present
        in `series_weights`. If `None`, all levels have the same weight. See Notes section
        for more details on the use of the weights.
        **New in version 0.6.0**

    series_weights_ : dict
        Weights associated with each series.It is created as a clone of `series_weights`
        and is used internally to avoid overwriting.
        **New in version 0.6.0**

    max_lag : int
        Maximum value of lag included in `lags`.

    window_size : int
        Size of the window needed to create the predictors. It is equal to
        `max_lag`.

    last_window : pandas Series
        Last window seen by the forecaster during training. It stores the values 
        needed to predict the next `step` immediately after the training data.

    index_type : type
        Type of index of the input used in training.

    index_freq : str
        Frequency of Index of the input used in training.

    index_values : pandas Index
        Values of Index of the input used in training.

    training_range: pandas Index
        First and last values of index of the data used during training.

    included_exog : bool
        If the forecaster has been trained using exogenous variable/s.

    exog_type : type
        Type of exogenous variable/s used in training.

    exog_dtypes : dict
        Type of each exogenous variable/s used in training. If `transformer_exog` 
        is used, the dtypes are calculated after the transformation.

    exog_col_names : list
        Names of columns of `exog` if `exog` used in training was a pandas
        DataFrame.

    series_col_names : list
        Names of the series (levels) used during training.

    X_train_col_names : list
        Names of columns of the matrix created internally for training.

    fit_kwargs : dict
        Additional arguments to be passed to the `fit` method of the regressor.
        **New in version 0.8.0**

    in_sample_residuals : dict
        Residuals of the model when predicting training data. Only stored up to
        1000 values in the form `{level: residuals}`. If `transformer_series` 
        is not `None`, residuals are stored in the transformed scale.

    out_sample_residuals : dict
        Residuals of the models when predicting non training data. Only stored
        up to 1000 values in the form `{level: residuals}`. If `transformer_series` 
        is not `None`, residuals are assumed to be in the transformed scale. Use 
        `set_out_sample_residuals()` method to set values.

    fitted : bool
        Tag to identify if the regressor has been fitted (trained).

    creation_date : str
        Date of creation.

    fit_date : str
        Date of last fit.

    skforcast_version : str
        Version of skforecast library used to create the forecaster.

    python_version : str
        Version of python used to create the forecaster.

    forecaster_id : str, int default `None`
        Name used as an identifier of the forecaster.
        **New in version 0.7.0**


    Notes
    -----

    The weights are used to control the influence that each observation has on the
    training of the model. `ForecasterAutoregMultiseries` accepts two types of weights:

    + series_weights : controls the relative importance of each series. If a series has
    twice as much weight as the others, the observations of that series influence the
    training twice as much. The higher the weight of a series relative to the others,
    the more the model will focus on trying to learn that series.

    + weight_func : controls the relative importance of each observation according to its
    index value. For example, a function that assigns a lower weight to certain dates.

    If the two types of weights are indicated, they are multiplied to create the final
    weights. The resulting `sample_weight` cannot have negative values.

    """

    def __init__(
        self,
        regressor: object,
        lags: Union[int, np.ndarray, list],
        transformer_series: Optional[Union[object, dict]]=None,
        transformer_exog: Optional[object]=None,
        weight_func: Optional[Union[Callable, dict]]=None,
        series_weights: Optional[dict]=None,
        fit_kwargs: Optional[dict]=None,
        forecaster_id: Optional[Union[str, int]]=None
    ) -> None:

        self.regressor               = regressor
        self.transformer_series      = transformer_series
        self.transformer_series_     = None
        self.transformer_exog        = transformer_exog
        self.weight_func             = weight_func
        self.weight_func_            = None
        self.source_code_weight_func = None
        self.series_weights          = series_weights
        self.series_weights_         = None
        self.index_type              = None
        self.index_freq              = None
        self.index_values            = None
        self.training_range          = None
        self.last_window             = None
        self.included_exog           = False
        self.exog_type               = None
        self.exog_dtypes             = None
        self.exog_col_names          = None
        self.series_col_names        = None
        self.X_train_col_names       = None
        self.in_sample_residuals     = None
        self.out_sample_residuals    = None
        self.fitted                  = False
        self.creation_date           = pd.Timestamp.today().strftime('%Y-%m-%d %H:%M:%S')
        self.fit_date                = None
        self.skforcast_version       = skforecast.__version__
        self.python_version          = sys.version.split(" ")[0]
        self.forecaster_id           = forecaster_id

        self.lags = initialize_lags(type(self).__name__, lags)
        self.max_lag = max(self.lags)
        self.window_size = self.max_lag

        self.weight_func, self.source_code_weight_func, self.series_weights = initialize_weights(
            forecaster_name = type(self).__name__, 
            regressor       = regressor, 
            weight_func     = weight_func, 
            series_weights  = series_weights
        )

        self.fit_kwargs = check_select_fit_kwargs(
                              regressor  = regressor,
                              fit_kwargs = fit_kwargs
                          )


    def __repr__(
        self
    ) -> str:
        """
        Information displayed when a ForecasterAutoregMultiSeries object is printed.
        """

        if isinstance(self.regressor, sklearn.pipeline.Pipeline):
            name_pipe_steps = tuple(name + "__" for name in self.regressor.named_steps.keys())
            params = {key : value for key, value in self.regressor.get_params().items() \
                      if key.startswith(name_pipe_steps)}
        else:
            params = self.regressor.get_params()

        info = (
            f"{'=' * len(type(self).__name__)} \n"
            f"{type(self).__name__} \n"
            f"{'=' * len(type(self).__name__)} \n"
            f"Regressor: {self.regressor} \n"
            f"Lags: {self.lags} \n"
            f"Transformer for series: {self.transformer_series} \n"
            f"Transformer for exog: {self.transformer_exog} \n"
            f"Window size: {self.window_size} \n"
            f"Series levels (names): {self.series_col_names} \n"
            f"Series weights: {self.series_weights} \n"
            f"Weight function included: {True if self.weight_func is not None else False} \n"
            f"Exogenous included: {self.included_exog} \n"
            f"Type of exogenous variable: {self.exog_type} \n"
            f"Exogenous variables names: {self.exog_col_names} \n"
            f"Training range: {self.training_range.to_list() if self.fitted else None} \n"
            f"Training index type: {str(self.index_type).split('.')[-1][:-2] if self.fitted else None} \n"
            f"Training index frequency: {self.index_freq if self.fitted else None} \n"
            f"Regressor parameters: {params} \n"
            f"fit_kwargs: {self.fit_kwargs} \n"
            f"Creation date: {self.creation_date} \n"
            f"Last fit date: {self.fit_date} \n"
            f"Skforecast version: {self.skforcast_version} \n"
            f"Python version: {self.python_version} \n"
            f"Forecaster id: {self.forecaster_id} \n"
        )

        return info


    def _create_lags(
        self, 
        y: np.ndarray
    ) -> Tuple[np.ndarray, np.ndarray]:
        """       
        Transforms a 1d array into a 2d array (X) and a 1d array (y). Each row
        in X is associated with a value of y and it represents the lags that
        precede it.

        Notice that, the returned matrix X_data, contains the lag 1 in the first
        column, the lag 2 in the second column and so on.

        Parameters
        ----------        
        y : 1d numpy ndarray
            Training time series.

        Returns 
        -------
        X_data : 2d numpy ndarray, shape (samples - max(self.lags), len(self.lags))
            2d numpy array with the lagged values (predictors).

        y_data : 1d numpy ndarray, shape (samples - max(self.lags),)
            Values of the time series related to each row of `X_data`.

        """

        n_splits = len(y) - self.max_lag
        if n_splits <= 0:
            raise ValueError(
                f"The maximum lag ({self.max_lag}) must be less than the length "
                f"of the series ({len(y)})."
            )

        X_data = np.full(shape=(n_splits, len(self.lags)), fill_value=np.nan, dtype=float)

        for i, lag in enumerate(self.lags):
            X_data[:, i] = y[self.max_lag - lag: -lag]

        y_data = y[self.max_lag:]

        return X_data, y_data


    def create_train_X_y(
        self,
        series: pd.DataFrame,
        exog: Optional[Union[pd.Series, pd.DataFrame]]=None
    ) -> Tuple[pd.DataFrame, pd.Series, pd.Index, pd.Index]:
        """
        Create training matrices from multiple time series and exogenous
        variables.

        Parameters
        ----------        
        series : pandas DataFrame
            Training time series.

        exog : pandas Series, pandas DataFrame, default `None`
            Exogenous variable/s included as predictor/s. Must have the same
            number of observations as `series` and their indexes must be aligned.

        Returns 
        -------
        X_train : pandas DataFrame
            Pandas DataFrame with the training values (predictors).

        y_train : pandas Series, shape (len(series) - self.max_lag, )
            Values (target) of the time series related to each row of `X_train`.

        y_index : pandas Index
            Index of `series`.

        y_train_index: pandas Index
            Index of `y_train`.

        """

        if not isinstance(series, pd.DataFrame):
            raise TypeError(f"`series` must be a pandas DataFrame. Got {type(series)}.")

        series_col_names = list(series.columns)

        if self.transformer_series is None:
            self.transformer_series_ = {serie: None for serie in series_col_names}
        elif not isinstance(self.transformer_series, dict):
            self.transformer_series_ = {serie: clone(self.transformer_series) 
                                        for serie in series_col_names}
        else:
            self.transformer_series_ = {serie: None for serie in series_col_names}
            # Only elements already present in transformer_series_ are updated
            self.transformer_series_.update(
                (k, v) for k, v in deepcopy(self.transformer_series).items() if k in self.transformer_series_
            )
            series_not_in_transformer_series = set(series.columns) - set(self.transformer_series.keys())
            if series_not_in_transformer_series:
                warnings.warn(
                    (f"{series_not_in_transformer_series} not present in `transformer_series`."
                     f" No transformation is applied to these series."),
                     IgnoredArgumentWarning
                )

        if exog is not None:
            if len(exog) != len(series):
                raise ValueError(
                    (f"`exog` must have same number of samples as `series`. "
                     f"length `exog`: ({len(exog)}), length `series`: ({len(series)})")
                )
            check_exog(exog=exog, allow_nan=True)
            if isinstance(exog, pd.Series):
                exog = transform_series(
                           series            = exog,
                           transformer       = self.transformer_exog,
                           fit               = True,
                           inverse_transform = False
                       )
            else:
                exog = transform_dataframe(
                           df                = exog,
                           transformer       = self.transformer_exog,
                           fit               = True,
                           inverse_transform = False
                       )

            check_exog(exog=exog, allow_nan=False)
            check_exog_dtypes(exog)
            self.exog_dtypes = get_exog_dtypes(exog=exog)

            _, exog_index = preprocess_exog(exog=exog, return_values=False)
            if not (exog_index[:len(series.index)] == series.index).all():
                raise ValueError(
                    ("Different index for `series` and `exog`. They must be equal "
                     "to ensure the correct alignment of values.")
                )

        X_levels = []
        X_train_col_names = [f"lag_{lag}" for lag in self.lags]

        for i, serie in enumerate(series.columns):

            y = series[serie]
            check_y(y=y)
            y = transform_series(
                    series            = y,
                    transformer       = self.transformer_series_[serie],
                    fit               = True,
                    inverse_transform = False
                )

            y_values, y_index = preprocess_y(y=y)
            X_train_values, y_train_values = self._create_lags(y=y_values)

            if i == 0:
                X_train = X_train_values
                y_train = y_train_values
            else:
                X_train = np.vstack((X_train, X_train_values))
                y_train = np.append(y_train, y_train_values)

            X_level = [serie]*len(X_train_values)
            X_levels.extend(X_level)

        X_levels = pd.Series(X_levels)
        X_levels = pd.get_dummies(X_levels, dtype=float)

        X_train = pd.DataFrame(
                      data    = X_train,
                      columns = X_train_col_names
                  )

        if exog is not None:
            # The first `self.max_lag` positions have to be removed from exog
            # since they are not in X_train. Then exog is cloned as many times
            # as series.
            exog_to_train = exog.iloc[self.max_lag:, ]
            exog_to_train = pd.concat([exog_to_train]*len(series_col_names)).reset_index(drop=True)
        else:
            exog_to_train = None

        X_train = pd.concat([X_train, exog_to_train, X_levels], axis=1)
        self.X_train_col_names = X_train.columns.to_list()

        y_train = pd.Series(
                      data = y_train,
                      name = 'y'
                  )

        y_train_index = pd.Index(
                            np.tile(
                                y_index[self.max_lag: ].to_numpy(),
                                reps = len(series_col_names)
                            )
                        )

        return X_train, y_train, y_index, y_train_index


    def create_sample_weights(
        self,
        series: pd.DataFrame,
        X_train: pd.DataFrame,
        y_train_index: pd.Index,
    )-> np.ndarray:
        """
        Crate weights for each observation according to the forecaster's attributes
        `series_weights` and `weight_func`. The resulting weights are product of both
        types of weights.

        Parameters
        ----------
        series : pandas DataFrame
            Time series used to create `X_train` with the method `create_train_X_y`.
        X_train : pandas DataFrame
            Dataframe generated with the method `create_train_X_y`, first return.
        y_train_index : pandas Index
            Index of `y_train` generated with the method `create_train_X_y`, fourth return.

        Returns
        -------
        weights : numpy ndarray
            Weights to use in `fit` method.

        """

        weights = None
        weights_samples = None
        weights_series = None

        if self.series_weights is not None:
            # Series not present in series_weights have a weight of 1 in all their samples.
            # Keys in series_weights not present in series are ignored.
            series_not_in_series_weights = set(series.columns) - set(self.series_weights.keys())
            if series_not_in_series_weights:
                warnings.warn(
                    (f"{series_not_in_series_weights} not present in `series_weights`. "
                     f"A weight of 1 is given to all their samples."),
                     IgnoredArgumentWarning
                )
            self.series_weights_ = {col: 1. for col in series.columns}
            self.series_weights_.update((k, v) for k, v in self.series_weights.items() if k in self.series_weights_)
            weights_series = [np.repeat(self.series_weights_[serie], sum(X_train[serie])) 
                              for serie in series.columns]
            weights_series = np.concatenate(weights_series)

        if self.weight_func is not None:
            if isinstance(self.weight_func, Callable):
                self.weight_func_ = {col: copy(self.weight_func) for col in series.columns}
            else:
                # Series not present in weight_func have a weight of 1 in all their samples
                series_not_in_weight_func = set(series.columns) - set(self.weight_func.keys())
                if series_not_in_weight_func:
                    warnings.warn(
                        (f"{series_not_in_weight_func} not present in `weight_func`. "
                         f"A weight of 1 is given to all their samples."),
                         IgnoredArgumentWarning
                    )
                self.weight_func_ = {col: lambda x: np.ones_like(x, dtype=float) for col in series.columns}
                self.weight_func_.update((k, v) for k, v in self.weight_func.items() if k in self.weight_func_)

            weights_samples = []
            for key in self.weight_func_.keys():
                idx = y_train_index[X_train[X_train[key] == 1.0].index]
                weights_samples.append(self.weight_func_[key](idx))
            weights_samples = np.concatenate(weights_samples)

        if weights_series is not None:
            weights = weights_series
            if weights_samples is not None:
                weights = weights * weights_samples
        else:
            if weights_samples is not None:
                weights = weights_samples

        if weights is not None:
            if np.isnan(weights).any():
                raise ValueError(
                    "The resulting `weights` cannot have NaN values."
                )
            if np.any(weights < 0):
                raise ValueError(
                    "The resulting `weights` cannot have negative values."
                )
            if np.sum(weights) == 0:
                raise ValueError(
                    ("The resulting `weights` cannot be normalized because "
                     "the sum of the weights is zero.")
                )

        return weights


    def fit(
        self,
        series: pd.DataFrame,
        exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
        store_in_sample_residuals: bool=True
    ) -> None:
        """
        Training Forecaster.

        Parameters
        ----------        
        series : pandas DataFrame
            Training time series.

        exog : pandas Series, pandas DataFrame, default `None`
            Exogenous variable/s included as predictor/s. Must have the same
            number of observations as `series` and their indexes must be aligned so
            that series[i] is regressed on exog[i].

        store_in_sample_residuals : bool, default `True`
            if True, in_sample_residuals are stored.

        Returns 
        -------
        None

        """

        # Reset values in case the forecaster has already been fitted.
        self.index_type          = None
        self.index_freq          = None
        self.index_values        = None
        self.last_window         = None
        self.included_exog       = False
        self.exog_type           = None
        self.exog_dtypes         = None
        self.exog_col_names      = None
        self.series_col_names    = None
        self.X_train_col_names   = None
        self.in_sample_residuals = None
        self.fitted              = False
        self.training_range      = None

        self.series_col_names = list(series.columns)        

        if exog is not None:
            self.included_exog = True
            self.exog_type = type(exog)
            self.exog_col_names = \
                 exog.columns.to_list() if isinstance(exog, pd.DataFrame) else [exog.name]

            if len(set(self.exog_col_names) - set(self.series_col_names)) != len(self.exog_col_names):
                raise ValueError(
                    (f"`exog` cannot contain a column named the same as one of the series"
                     f" (column names of series).\n"
                     f"    `series` columns : {self.series_col_names}.\n"
                     f"    `exog`   columns : {self.exog_col_names}.")
                )

        X_train, y_train, y_index, y_train_index = self.create_train_X_y(series=series, exog=exog)
        sample_weight = self.create_sample_weights(
                            series        = series,
                            X_train       = X_train,
                            y_train_index = y_train_index,
                        )

        if sample_weight is not None:
            self.regressor.fit(
                X             = X_train,
                y             = y_train,
                sample_weight = sample_weight,
                **self.fit_kwargs
            )
        else:
            self.regressor.fit(X=X_train, y=y_train, **self.fit_kwargs)

        self.fitted = True
        self.fit_date = pd.Timestamp.today().strftime('%Y-%m-%d %H:%M:%S')
        self.training_range = y_index[[0, -1]]
        self.index_type = type(y_index)
        if isinstance(y_index, pd.DatetimeIndex):
            self.index_freq = y_index.freqstr
        else: 
            self.index_freq = y_index.step
        self.index_values = y_index

        in_sample_residuals = {}

        # This is done to save time during fit in functions such as backtesting()
        if store_in_sample_residuals:

            residuals = y_train - self.regressor.predict(X_train)

            for serie in series.columns:
                in_sample_residuals[serie] = residuals.loc[X_train[serie] == 1.].to_numpy()
                if len(in_sample_residuals[serie]) > 1000:
                    # Only up to 1000 residuals are stored
                    rng = np.random.default_rng(seed=123)
                    in_sample_residuals[serie] = rng.choice(
                                                     a       = in_sample_residuals[serie], 
                                                     size    = 1000, 
                                                     replace = False
                                                 )
        else:
            for serie in series.columns:
                in_sample_residuals[serie] = None

        self.in_sample_residuals = in_sample_residuals

        # The last time window of training data is stored so that lags needed as
        # predictors in the first iteration of `predict()` can be calculated.
        self.last_window = series.iloc[-self.max_lag:, ].copy()


    def _recursive_predict(
        self,
        steps: int,
        level: str,
        last_window: np.ndarray,
        exog: Optional[np.ndarray]=None
    ) -> np.ndarray:
        """
        Predict n steps ahead. It is an iterative process in which, each prediction,
        is used as a predictor for the next step.

        Parameters
        ----------
        steps : int
            Number of future steps predicted.

        level : str
            Time series to be predicted.

        last_window : numpy ndarray
            Series values used to create the predictors (lags) needed in the 
            first iteration of the prediction (t + 1).

        exog : numpy ndarray, default `None`
            Exogenous variable/s included as predictor/s.

        Returns 
        -------
        predictions : numpy ndarray
            Predicted values.

        """

        predictions = np.full(shape=steps, fill_value=np.nan)

        for i in range(steps):
            X = last_window[-self.lags].reshape(1, -1)
            if exog is not None:
                X = np.column_stack((X, exog[i, ].reshape(1, -1)))

            levels_dummies = np.zeros(shape=(1, len(self.series_col_names)), dtype=float)
            levels_dummies[0][self.series_col_names.index(level)] = 1.

            X = np.column_stack((X, levels_dummies.reshape(1, -1)))

            with warnings.catch_warnings():
                # Suppress scikit-learn warning: "X does not have valid feature names,
                # but NoOpTransformer was fitted with feature names".
                warnings.simplefilter("ignore")
                prediction = self.regressor.predict(X)
                predictions[i] = prediction.ravel()[0]

            # Update `last_window` values. The first position is discarded and 
            # the new prediction is added at the end.
            last_window = np.append(last_window[1:], prediction)

        return predictions


    def predict(
        self,
        steps: int,
        levels: Optional[Union[str, list]]=None,
        last_window: Optional[pd.DataFrame]=None,
        exog: Optional[Union[pd.Series, pd.DataFrame]]=None
    ) -> pd.DataFrame:
        """
        Predict n steps ahead. It is an recursive process in which, each prediction,
        is used as a predictor for the next step.

        Parameters
        ----------
        steps : int
            Number of future steps predicted.

        levels : str, list, default `None`
            Time series to be predicted. If `None` all levels will be predicted.
            **New in version 0.6.0**

        last_window : pandas DataFrame, default `None`
            Series values used to create the predictors (lags) needed in the 
            first iteration of the prediction (t + 1).

            If `last_window = None`, the values stored in `self.last_window` are
            used to calculate the initial predictors, and the predictions start
            right after training data.

        exog : pandas Series, pandas DataFrame, default `None`
            Exogenous variable/s included as predictor/s.

        Returns
        -------
        predictions : pandas DataFrame
            Predicted values, one column for each level.

        """

        if levels is None:
            levels = self.series_col_names
        elif isinstance(levels, str):
            levels = [levels]

        if last_window is None:
            last_window = deepcopy(self.last_window)

        check_predict_input(
            forecaster_name  = type(self).__name__,
            steps            = steps,
            fitted           = self.fitted,
            included_exog    = self.included_exog,
            index_type       = self.index_type,
            index_freq       = self.index_freq,
            window_size      = self.window_size,
            last_window      = last_window,
            last_window_exog = None,
            exog             = exog,
            exog_type        = self.exog_type,
            exog_col_names   = self.exog_col_names,
            interval         = None,
            alpha            = None,
            max_steps        = None,
            levels           = levels,
            series_col_names = self.series_col_names
        )

        if exog is not None:
            if isinstance(exog, pd.DataFrame):
                exog = transform_dataframe(
                           df                = exog,
                           transformer       = self.transformer_exog,
                           fit               = False,
                           inverse_transform = False
                       )
            else:
                exog = transform_series(
                           series            = exog,
                           transformer       = self.transformer_exog,
                           fit               = False,
                           inverse_transform = False
                       )
            check_exog_dtypes(exog=exog)
            exog_values = exog.iloc[:steps, ].to_numpy()
        else:
            exog_values = None

        predictions = []

        for level in levels:

            last_window_level = transform_series(
                                    series            = last_window[level],
                                    transformer       = self.transformer_series_[level],
                                    fit               = False,
                                    inverse_transform = False
                                )
            last_window_values, last_window_index = preprocess_last_window(
                                                        last_window = last_window_level
                                                    )

            preds_level = self._recursive_predict(
                              steps       = steps,
                              level       = level,
                              last_window = copy(last_window_values),
                              exog        = copy(exog_values)
                          )

            preds_level = pd.Series(
                              data  = preds_level,
                              index = expand_index(
                                          index = last_window_index,
                                          steps = steps
                                      ),
                              name = level
                          )

            preds_level = transform_series(
                              series            = preds_level,
                              transformer       = self.transformer_series_[level],
                              fit               = False,
                              inverse_transform = True
                          )

            predictions.append(preds_level)    

        predictions = pd.concat(predictions, axis=1)

        return predictions


    def predict_bootstrapping(
        self,
        steps: int,
        levels: Optional[Union[str, list]]=None,
        last_window: Optional[pd.DataFrame]=None,
        exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
        n_boot: int=500,
        random_state: int=123,
        in_sample_residuals: bool=True
    ) -> dict:
        """
        Generate multiple forecasting predictions using a bootstrapping process. 
        By sampling from a collection of past observed errors (the residuals),
        each iteration of bootstrapping generates a different set of predictions. 
        See the Notes section for more information. 

        Parameters
        ----------   
        steps : int
            Number of future steps predicted.

        levels : str, list, default `None`
            Time series to be predicted. If `None` all levels will be predicted.

        last_window : pandas DataFrame, default `None`
            Series values used to create the predictors (lags) needed in the 
            first iteration of the prediction (t + 1).

            If `last_window = None`, the values stored in `self.last_window` are
            used to calculate the initial predictors, and the predictions start
            right after training data.

        exog : pandas Series, pandas DataFrame, default `None`
            Exogenous variable/s included as predictor/s.

        n_boot : int, default `500`
            Number of bootstrapping iterations used to estimate prediction
            intervals.

        random_state : int, default `123`
            Sets a seed to the random generator, so that boot intervals are always 
            deterministic.

        in_sample_residuals : bool, default `True`
            If `True`, residuals from the training data are used as proxy of
            prediction error to create prediction intervals. If `False`, out of
            sample residuals are used. In the latter case, the user should have
            calculated and stored the residuals within the forecaster (see
            `set_out_sample_residuals()`).

        Returns 
        -------
        boot_predictions : dict
            Predictions generated by bootstrapping for each level. 
            {level: pandas DataFrame, shape (steps, n_boot)}

        Notes
        -----
        More information about prediction intervals in forecasting:
        https://otexts.com/fpp3/prediction-intervals.html#prediction-intervals-from-bootstrapped-residuals
        Forecasting: Principles and Practice (3nd ed) Rob J Hyndman and George Athanasopoulos.

        """

        if levels is None:
            levels = self.series_col_names
        elif isinstance(levels, str):
            levels = [levels]

        if in_sample_residuals:
            if not set(levels).issubset(set(self.in_sample_residuals.keys())):
                raise ValueError(
                    (f"Not `forecaster.in_sample_residuals` for levels: "
                     f"{set(levels) - set(self.in_sample_residuals.keys())}.")
                )
            residuals_levels = self.in_sample_residuals
        else:
            if self.out_sample_residuals is None:
                raise ValueError(
                    ("`forecaster.out_sample_residuals` is `None`. Use "
                     "`in_sample_residuals=True` or method `set_out_sample_residuals()` "
                     "before `predict_interval()`, `predict_bootstrapping()` or "
                     "`predict_dist()`.")
                )
            else:
                if not set(levels).issubset(set(self.out_sample_residuals.keys())):
                    raise ValueError(
                        (f"Not `forecaster.out_sample_residuals` for levels: "
                         f"{set(levels) - set(self.out_sample_residuals.keys())}. "
                         f"Use method `set_out_sample_residuals()`.")
                    )
            residuals_levels = self.out_sample_residuals

        check_residuals = 'forecaster.in_sample_residuals' if in_sample_residuals else 'forecaster.out_sample_residuals'
        for level in levels:
            if residuals_levels[level] is None:
                raise ValueError(
                    (f"forecaster residuals for level '{level}' are `None`. Check `{check_residuals}`.")
                )
            elif (residuals_levels[level] == None).any():
                raise ValueError(
                    (f"forecaster residuals for level '{level}' contains `None` values. Check `{check_residuals}`.")
                )

        if last_window is None:
            last_window = deepcopy(self.last_window)

        check_predict_input(
            forecaster_name  = type(self).__name__,
            steps            = steps,
            fitted           = self.fitted,
            included_exog    = self.included_exog,
            index_type       = self.index_type,
            index_freq       = self.index_freq,
            window_size      = self.window_size,
            last_window      = last_window,
            last_window_exog = None,
            exog             = exog,
            exog_type        = self.exog_type,
            exog_col_names   = self.exog_col_names,
            interval         = None,
            alpha            = None,
            max_steps        = None,
            levels           = levels,
            series_col_names = self.series_col_names
        )

        if exog is not None:
            if isinstance(exog, pd.DataFrame):
                exog = transform_dataframe(
                           df                = exog,
                           transformer       = self.transformer_exog,
                           fit               = False,
                           inverse_transform = False
                       )
            else:
                exog = transform_series(
                           series            = exog,
                           transformer       = self.transformer_exog,
                           fit               = False,
                           inverse_transform = False
                       )
            exog_values = exog.iloc[:steps, ].to_numpy()
        else:
            exog_values = None

        boot_predictions = {}

        for level in levels:

            last_window_level = transform_series(
                                    series            = last_window[level],
                                    transformer       = self.transformer_series_[level],
                                    fit               = False,
                                    inverse_transform = False
                                )
            last_window_values, last_window_index = preprocess_last_window(
                                                        last_window = last_window_level
                                                    )

            level_boot_predictions = np.full(
                                         shape      = (steps, n_boot),
                                         fill_value = np.nan,
                                         dtype      = float
                                     )
            rng = np.random.default_rng(seed=random_state)
            seeds = rng.integers(low=0, high=10000, size=n_boot)

            residuals = residuals_levels[level]

            for i in range(n_boot):
                # In each bootstraping iteration the initial last_window and exog 
                # need to be restored.
                last_window_boot = last_window_values.copy()
                exog_boot = exog_values.copy() if exog is not None else None

                rng = np.random.default_rng(seed=seeds[i])
                sample_residuals = rng.choice(
                                       a       = residuals,
                                       size    = steps,
                                       replace = True
                                   )

                for step in range(steps):

                    prediction = self._recursive_predict(
                                     steps       = 1,
                                     level       = level,
                                     last_window = last_window_boot,
                                     exog        = exog_boot 
                                 )

                    prediction_with_residual = prediction + sample_residuals[step]
                    level_boot_predictions[step, i] = prediction_with_residual

                    last_window_boot = np.append(
                                           last_window_boot[1:],
                                           prediction_with_residual
                                       )
                    if exog is not None:
                        exog_boot = exog_boot[1:]

            level_boot_predictions = pd.DataFrame(
                                         data    = level_boot_predictions,
                                         index   = expand_index(last_window_index, steps=steps),
                                         columns = [f"pred_boot_{i}" for i in range(n_boot)]
                                     )

            if self.transformer_series_[level]:
                for col in level_boot_predictions.columns:
                    level_boot_predictions[col] = transform_series(
                                                      series            = level_boot_predictions[col],
                                                      transformer       = self.transformer_series_[level],
                                                      fit               = False,
                                                      inverse_transform = True
                                                  )

            boot_predictions[level] = level_boot_predictions

        return boot_predictions


    def predict_interval(
        self,
        steps: int,
        levels: Optional[Union[str, list]]=None,
        last_window: Optional[pd.DataFrame]=None,
        exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
        interval: list=[5, 95],
        n_boot: int=500,
        random_state: int=123,
        in_sample_residuals: bool=True
    ) -> pd.DataFrame:
        """
        Iterative process in which, each prediction, is used as a predictor
        for the next step and bootstrapping is used to estimate prediction
        intervals. Both predictions and intervals are returned.

        Parameters
        ---------- 
        steps : int
            Number of future steps predicted.

        levels : str, list, default `None`
            Time series to be predicted. If `None` all levels will be predicted.  
            **New in version 0.6.0**  

        last_window : pandas DataFrame, default `None`
            Series values used to create the predictors (lags) needed in the 
            first iteration of the prediction (t + 1).

            If `last_window = None`, the values stored in` self.last_window` are
            used to calculate the initial predictors, and the predictions start
            right after training data.

        exog : pandas Series, pandas DataFrame, default `None`
            Exogenous variable/s included as predictor/s.

        interval : list, default `[5, 95]`
            Confidence of the prediction interval estimated. Sequence of 
            percentiles to compute, which must be between 0 and 100 inclusive. 
            For example, interval of 95% should be as `interval = [2.5, 97.5]`.

        n_boot : int, default `500`
            Number of bootstrapping iterations used to estimate prediction
            intervals.

        random_state : int, default `123`
            Sets a seed to the random generator, so that boot intervals are always 
            deterministic.

        in_sample_residuals : bool, default `True`
            If `True`, residuals from the training data are used as proxy of
            prediction error to create prediction intervals. If `False`, out of
            sample residuals are used. In the latter case, the user should have
            calculated and stored the residuals within the forecaster (see
            `set_out_sample_residuals()`).

        Returns 
        -------
        predictions : pandas DataFrame
            Values predicted by the forecaster and their estimated interval.
                level: predictions.
                level_lower_bound: lower bound of the interval.
                level_upper_bound: upper bound interval of the interval.

        Notes
        -----
        More information about prediction intervals in forecasting:
        https://otexts.com/fpp2/prediction-intervals.html
        Forecasting: Principles and Practice (2nd ed) Rob J Hyndman and
        George Athanasopoulos.

        """

        if levels is None:
            levels = self.series_col_names
        elif isinstance(levels, str):
            levels = [levels]

        check_interval(interval=interval)

        preds = self.predict(
                    steps       = steps,
                    levels      = levels,
                    last_window = last_window,
                    exog        = exog
                )

        boot_predictions = self.predict_bootstrapping(
                               steps               = steps,
                               levels              = levels,
                               last_window         = last_window,
                               exog                = exog,
                               n_boot              = n_boot,
                               random_state        = random_state,
                               in_sample_residuals = in_sample_residuals
                           )

        interval = np.array(interval)/100
        predictions = []

        for level in levels:
            preds_interval = boot_predictions[level].quantile(q=interval, axis=1).transpose()
            preds_interval.columns = [f'{level}_lower_bound', f'{level}_upper_bound']
            predictions.append(preds[level])
            predictions.append(preds_interval)

        predictions = pd.concat(predictions, axis=1)

        return predictions


    def predict_dist(
        self,
        steps: int,
        distribution: object,
        levels: Optional[Union[str, list]]=None,
        last_window: Optional[pd.DataFrame]=None,
        exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
        n_boot: int=500,
        random_state: int=123,
        in_sample_residuals: bool=True
    ) -> pd.DataFrame:
        """
        Fit a given probability distribution for each step. After generating 
        multiple forecasting predictions through a bootstrapping process, each 
        step is fitted to the given distribution.

        Parameters
        ---------- 
        steps : int
            Number of future steps predicted.

        distribution : Object
            A distribution object from scipy.stats.

        levels : str, list, default `None`
            Time series to be predicted. If `None` all levels will be predicted.  
            **New in version 0.6.0**  

        last_window : pandas DataFrame, default `None`
            Series values used to create the predictors (lags) needed in the 
            first iteration of the prediction (t + 1).

            If `last_window = None`, the values stored in` self.last_window` are
            used to calculate the initial predictors, and the predictions start
            right after training data.

        exog : pandas Series, pandas DataFrame, default `None`
            Exogenous variable/s included as predictor/s.

        n_boot : int, default `500`
            Number of bootstrapping iterations used to estimate prediction
            intervals.

        random_state : int, default `123`
            Sets a seed to the random generator, so that boot intervals are always 
            deterministic.

        in_sample_residuals : bool, default `True`
            If `True`, residuals from the training data are used as proxy of
            prediction error to create prediction intervals. If `False`, out of
            sample residuals are used. In the latter case, the user should have
            calculated and stored the residuals within the forecaster (see
            `set_out_sample_residuals()`).

        Returns 
        -------
        predictions : pandas DataFrame
            Distribution parameters estimated for each step and level.

        """

        if levels is None:
            levels = self.series_col_names
        elif isinstance(levels, str):
            levels = [levels]

        boot_samples = self.predict_bootstrapping(
                           steps               = steps,
                           levels              = levels,
                           last_window         = last_window,
                           exog                = exog,
                           n_boot              = n_boot,
                           random_state        = random_state,
                           in_sample_residuals = in_sample_residuals
                       )

        param_names = [p for p in inspect.signature(distribution._pdf).parameters if not p=='x'] + ["loc","scale"]
        predictions = []

        for level in levels:
            param_values = np.apply_along_axis(lambda x: distribution.fit(x), axis=1, arr=boot_samples[level])
            level_param_names = [f'{level}_{p}' for p in param_names]

            pred_level = pd.DataFrame(
                             data    = param_values,
                             columns = level_param_names,
                             index   = boot_samples[level].index
                         )

            predictions.append(pred_level)

        predictions = pd.concat(predictions, axis=1)

        return predictions


    def set_params(
        self, 
        params: dict
    ) -> None:
        """
        Set new values to the parameters of the scikit learn model stored in the
        forecaster.

        Parameters
        ----------
        params : dict
            Parameters values.

        Returns 
        -------
        self

        """

        self.regressor = clone(self.regressor)
        self.regressor.set_params(**params)


    def set_fit_kwargs(
        self, 
        fit_kwargs: dict
    ) -> None:
        """
        Set new values for the additional keyword arguments passed to the `fit` 
        method of the regressor.

        Parameters
        ----------
        fit_kwargs : dict
            Dict of the form {"argument": new_value}.

        Returns 
        -------
        None

        """

        self.fit_kwargs = check_select_fit_kwargs(self.regressor, fit_kwargs=fit_kwargs)


    def set_lags(
        self, 
        lags: Union[int, list, np.ndarray, range]
    ) -> None:
        """      
        Set new value to the attribute `lags`.
        Attributes `max_lag` and `window_size` are also updated.

        Parameters
        ----------
        lags : int, list, 1D np.array, range
            Lags used as predictors. Index starts at 1, so lag 1 is equal to t-1.
                `int`: include lags from 1 to `lags`.
                `list` or `np.array`: include only lags present in `lags`.

        Returns 
        -------
        None

        """

        self.lags = initialize_lags(type(self).__name__, lags)            
        self.max_lag  = max(self.lags)
        self.window_size = max(self.lags)


    def set_out_sample_residuals(
        self, 
        residuals: dict,
        append: bool=True,
        transform: bool=True,
        random_state: int=123
    )-> None:
        """
        Set new values to the attribute `out_sample_residuals`. Out of sample
        residuals are meant to be calculated using observations that did not
        participate in the training process.

        Parameters
        ----------
        residuals : dict
            Dictionary of numpy ndarrays with the residuals of each level in the
            form {level: residuals}. If len(residuals) > 1000, only a random 
            sample of 1000 values are stored. Keys must be the same as `levels`.

        append : bool, default `True`
            If `True`, new residuals are added to the once already stored in the
            attribute `out_sample_residuals`. Once the limit of 1000 values is
            reached, no more values are appended. If False, `out_sample_residuals`
            is overwritten with the new residuals.

        transform : bool, default `True`
            If `True`, new residuals are transformed using self.transformer_series.

        random_state : int, default `123`
            Sets a seed to the random sampling for reproducible output.

        Returns 
        -------
        self

        """

        if not isinstance(residuals, dict) or not all(isinstance(x, np.ndarray) for x in residuals.values()):
            raise TypeError(
                f"`residuals` argument must be a dict of numpy ndarrays in the form "
                "`{level: residuals}`. " 
                f"Got {type(residuals)}."
            )

        if not self.fitted:
            raise sklearn.exceptions.NotFittedError(
                ("This forecaster is not fitted yet. Call `fit` with appropriate "
                 "arguments before using `set_out_sample_residuals()`.")
            )

        if self.out_sample_residuals is None:
            self.out_sample_residuals = {level: None for level in self.series_col_names}

        if not set(self.out_sample_residuals.keys()).issubset(set(residuals.keys())):
            warnings.warn(
                (f"""
                Only residuals of levels 
                {set(self.out_sample_residuals.keys()).intersection(set(residuals.keys()))} 
                are updated.
                """), IgnoredArgumentWarning
            )

        residuals = {key: value for key, value in residuals.items() if key in self.out_sample_residuals.keys()}

        for level, value in residuals.items():

            residuals_level = value

            if not transform and self.transformer_series_[level] is not None:
                warnings.warn(
                    ("Argument `transform` is set to `False` but forecaster was "
                    f"trained using a transformer {self.transformer_series_[level]} "
                    f"for level {level}. Ensure that the new residuals are "
                     "already transformed or set `transform=True`.")
                )

            if transform and self.transformer_series_ and self.transformer_series_[level]:
                warnings.warn(
                    ("Residuals will be transformed using the same transformer used "
                    f"when training the forecaster for level {level} : "
                    f"({self.transformer_series_[level]}). Ensure that the new "
                     "residuals are on the same scale as the original time series.")
                )
                residuals_level = transform_series(
                                      series            = pd.Series(residuals_level, name='residuals'),
                                      transformer       = self.transformer_series_[level],
                                      fit               = False,
                                      inverse_transform = False
                                  ).to_numpy()

            if len(residuals_level) > 1000:
                rng = np.random.default_rng(seed=random_state)
                residuals_level = rng.choice(a=residuals_level, size=1000, replace=False)

            if append and self.out_sample_residuals[level] is not None:
                free_space = max(0, 1000 - len(self.out_sample_residuals[level]))
                if len(residuals_level) < free_space:
                    residuals_level = np.hstack((
                                            self.out_sample_residuals[level],
                                            residuals_level
                                        ))
                else:
                    residuals_level = np.hstack((
                                            self.out_sample_residuals[level],
                                            residuals_level[:free_space]
                                        ))

            self.out_sample_residuals[level] = residuals_level


    def get_feature_importances(
        self
    ) -> pd.DataFrame:
        """
        Return feature importances of the regressor stored in the
        forecaster. Only valid when regressor stores internally the feature
        importances in the attribute `feature_importances_` or `coef_`.

        Parameters
        ----------
        self

        Returns
        -------
        feature_importances : pandas DataFrame
            Feature importances associated with each predictor.

        """

        if not self.fitted:
            raise sklearn.exceptions.NotFittedError(
                ("This forecaster is not fitted yet. Call `fit` with appropriate "
                 "arguments before using `get_feature_importances()`.")
            )

        if isinstance(self.regressor, sklearn.pipeline.Pipeline):
            estimator = self.regressor[-1]
        else:
            estimator = self.regressor

        if hasattr(estimator, 'feature_importances_'):
            feature_importances = estimator.feature_importances_
        elif hasattr(estimator, 'coef_'):
            feature_importances = estimator.coef_
        else:
            warnings.warn(
                (f"Impossible to access feature importances for regressor of type "
                 f"{type(estimator)}. This method is only valid when the "
                 f"regressor stores internally the feature importances in the "
                 f"attribute `feature_importances_` or `coef_`.")
            )
            feature_importances = None

        if feature_importances is not None:
            feature_importances = pd.DataFrame({
                                      'feature': self.X_train_col_names,
                                      'importance': feature_importances
                                  })

        return feature_importances


    def get_feature_importance(
        self
    ) -> pd.DataFrame:
        """
        This method has been replaced by `get_feature_importances()`.

        Return feature importances of the regressor stored in the
        forecaster. Only valid when regressor stores internally the feature
        importances in the attribute `feature_importances_` or `coef_`.

        Parameters
        ----------
        self

        Returns
        -------
        feature_importances : pandas DataFrame
            Feature importances associated with each predictor.

        """

        warnings.warn(
            ("get_feature_importance() method has been renamed to get_feature_importances()."
             "This method will be removed in skforecast 0.9.0.")
        )

        return self.get_feature_importances()

`create_sample_weights(self, series, X_train, y_train_index)` ¶

Crate weights for each observation according to the forecaster's attributes

series_weights and weight_func. The resulting weights are product of both types of weights.

Parameters:

Name	Type	Description	Default
`series`	`DataFrame`	Time series used to create `X_train` with the method `create_train_X_y`.	required
`X_train`	`DataFrame`	Dataframe generated with the method `create_train_X_y`, first return.	required
`y_train_index`	`Index`	Index of `y_train` generated with the method `create_train_X_y`, fourth return.	required

Returns:

Type	Description
`ndarray`	Weights to use in `fit` method.

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def create_sample_weights(
    self,
    series: pd.DataFrame,
    X_train: pd.DataFrame,
    y_train_index: pd.Index,
)-> np.ndarray:
    """
    Crate weights for each observation according to the forecaster's attributes
    `series_weights` and `weight_func`. The resulting weights are product of both
    types of weights.

    Parameters
    ----------
    series : pandas DataFrame
        Time series used to create `X_train` with the method `create_train_X_y`.
    X_train : pandas DataFrame
        Dataframe generated with the method `create_train_X_y`, first return.
    y_train_index : pandas Index
        Index of `y_train` generated with the method `create_train_X_y`, fourth return.

    Returns
    -------
    weights : numpy ndarray
        Weights to use in `fit` method.

    """

    weights = None
    weights_samples = None
    weights_series = None

    if self.series_weights is not None:
        # Series not present in series_weights have a weight of 1 in all their samples.
        # Keys in series_weights not present in series are ignored.
        series_not_in_series_weights = set(series.columns) - set(self.series_weights.keys())
        if series_not_in_series_weights:
            warnings.warn(
                (f"{series_not_in_series_weights} not present in `series_weights`. "
                 f"A weight of 1 is given to all their samples."),
                 IgnoredArgumentWarning
            )
        self.series_weights_ = {col: 1. for col in series.columns}
        self.series_weights_.update((k, v) for k, v in self.series_weights.items() if k in self.series_weights_)
        weights_series = [np.repeat(self.series_weights_[serie], sum(X_train[serie])) 
                          for serie in series.columns]
        weights_series = np.concatenate(weights_series)

    if self.weight_func is not None:
        if isinstance(self.weight_func, Callable):
            self.weight_func_ = {col: copy(self.weight_func) for col in series.columns}
        else:
            # Series not present in weight_func have a weight of 1 in all their samples
            series_not_in_weight_func = set(series.columns) - set(self.weight_func.keys())
            if series_not_in_weight_func:
                warnings.warn(
                    (f"{series_not_in_weight_func} not present in `weight_func`. "
                     f"A weight of 1 is given to all their samples."),
                     IgnoredArgumentWarning
                )
            self.weight_func_ = {col: lambda x: np.ones_like(x, dtype=float) for col in series.columns}
            self.weight_func_.update((k, v) for k, v in self.weight_func.items() if k in self.weight_func_)

        weights_samples = []
        for key in self.weight_func_.keys():
            idx = y_train_index[X_train[X_train[key] == 1.0].index]
            weights_samples.append(self.weight_func_[key](idx))
        weights_samples = np.concatenate(weights_samples)

    if weights_series is not None:
        weights = weights_series
        if weights_samples is not None:
            weights = weights * weights_samples
    else:
        if weights_samples is not None:
            weights = weights_samples

    if weights is not None:
        if np.isnan(weights).any():
            raise ValueError(
                "The resulting `weights` cannot have NaN values."
            )
        if np.any(weights < 0):
            raise ValueError(
                "The resulting `weights` cannot have negative values."
            )
        if np.sum(weights) == 0:
            raise ValueError(
                ("The resulting `weights` cannot be normalized because "
                 "the sum of the weights is zero.")
            )

    return weights

`create_train_X_y(self, series, exog=None)` ¶

Create training matrices from multiple time series and exogenous

variables.

Parameters:

Name	Type	Description	Default
`series`	`DataFrame`	Training time series.	required
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s. Must have the same number of observations as `series` and their indexes must be aligned.	`None`

Returns:

Type	Description
`Tuple[pandas.core.frame.DataFrame, pandas.core.series.Series, pandas.core.indexes.base.Index, pandas.core.indexes.base.Index]`	Pandas DataFrame with the training values (predictors).

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def create_train_X_y(
    self,
    series: pd.DataFrame,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None
) -> Tuple[pd.DataFrame, pd.Series, pd.Index, pd.Index]:
    """
    Create training matrices from multiple time series and exogenous
    variables.

    Parameters
    ----------        
    series : pandas DataFrame
        Training time series.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `series` and their indexes must be aligned.

    Returns 
    -------
    X_train : pandas DataFrame
        Pandas DataFrame with the training values (predictors).

    y_train : pandas Series, shape (len(series) - self.max_lag, )
        Values (target) of the time series related to each row of `X_train`.

    y_index : pandas Index
        Index of `series`.

    y_train_index: pandas Index
        Index of `y_train`.

    """

    if not isinstance(series, pd.DataFrame):
        raise TypeError(f"`series` must be a pandas DataFrame. Got {type(series)}.")

    series_col_names = list(series.columns)

    if self.transformer_series is None:
        self.transformer_series_ = {serie: None for serie in series_col_names}
    elif not isinstance(self.transformer_series, dict):
        self.transformer_series_ = {serie: clone(self.transformer_series) 
                                    for serie in series_col_names}
    else:
        self.transformer_series_ = {serie: None for serie in series_col_names}
        # Only elements already present in transformer_series_ are updated
        self.transformer_series_.update(
            (k, v) for k, v in deepcopy(self.transformer_series).items() if k in self.transformer_series_
        )
        series_not_in_transformer_series = set(series.columns) - set(self.transformer_series.keys())
        if series_not_in_transformer_series:
            warnings.warn(
                (f"{series_not_in_transformer_series} not present in `transformer_series`."
                 f" No transformation is applied to these series."),
                 IgnoredArgumentWarning
            )

    if exog is not None:
        if len(exog) != len(series):
            raise ValueError(
                (f"`exog` must have same number of samples as `series`. "
                 f"length `exog`: ({len(exog)}), length `series`: ({len(series)})")
            )
        check_exog(exog=exog, allow_nan=True)
        if isinstance(exog, pd.Series):
            exog = transform_series(
                       series            = exog,
                       transformer       = self.transformer_exog,
                       fit               = True,
                       inverse_transform = False
                   )
        else:
            exog = transform_dataframe(
                       df                = exog,
                       transformer       = self.transformer_exog,
                       fit               = True,
                       inverse_transform = False
                   )

        check_exog(exog=exog, allow_nan=False)
        check_exog_dtypes(exog)
        self.exog_dtypes = get_exog_dtypes(exog=exog)

        _, exog_index = preprocess_exog(exog=exog, return_values=False)
        if not (exog_index[:len(series.index)] == series.index).all():
            raise ValueError(
                ("Different index for `series` and `exog`. They must be equal "
                 "to ensure the correct alignment of values.")
            )

    X_levels = []
    X_train_col_names = [f"lag_{lag}" for lag in self.lags]

    for i, serie in enumerate(series.columns):

        y = series[serie]
        check_y(y=y)
        y = transform_series(
                series            = y,
                transformer       = self.transformer_series_[serie],
                fit               = True,
                inverse_transform = False
            )

        y_values, y_index = preprocess_y(y=y)
        X_train_values, y_train_values = self._create_lags(y=y_values)

        if i == 0:
            X_train = X_train_values
            y_train = y_train_values
        else:
            X_train = np.vstack((X_train, X_train_values))
            y_train = np.append(y_train, y_train_values)

        X_level = [serie]*len(X_train_values)
        X_levels.extend(X_level)

    X_levels = pd.Series(X_levels)
    X_levels = pd.get_dummies(X_levels, dtype=float)

    X_train = pd.DataFrame(
                  data    = X_train,
                  columns = X_train_col_names
              )

    if exog is not None:
        # The first `self.max_lag` positions have to be removed from exog
        # since they are not in X_train. Then exog is cloned as many times
        # as series.
        exog_to_train = exog.iloc[self.max_lag:, ]
        exog_to_train = pd.concat([exog_to_train]*len(series_col_names)).reset_index(drop=True)
    else:
        exog_to_train = None

    X_train = pd.concat([X_train, exog_to_train, X_levels], axis=1)
    self.X_train_col_names = X_train.columns.to_list()

    y_train = pd.Series(
                  data = y_train,
                  name = 'y'
              )

    y_train_index = pd.Index(
                        np.tile(
                            y_index[self.max_lag: ].to_numpy(),
                            reps = len(series_col_names)
                        )
                    )

    return X_train, y_train, y_index, y_train_index

`fit(self, series, exog=None, store_in_sample_residuals=True)` ¶

Training Forecaster.

Parameters:

Name	Type	Description	Default
`series`	`DataFrame`	Training time series.	required
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s. Must have the same number of observations as `series` and their indexes must be aligned so that series[i] is regressed on exog[i].	`None`
`store_in_sample_residuals`	`bool`	if True, in_sample_residuals are stored.	`True`

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def fit(
    self,
    series: pd.DataFrame,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    store_in_sample_residuals: bool=True
) -> None:
    """
    Training Forecaster.

    Parameters
    ----------        
    series : pandas DataFrame
        Training time series.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `series` and their indexes must be aligned so
        that series[i] is regressed on exog[i].

    store_in_sample_residuals : bool, default `True`
        if True, in_sample_residuals are stored.

    Returns 
    -------
    None

    """

    # Reset values in case the forecaster has already been fitted.
    self.index_type          = None
    self.index_freq          = None
    self.index_values        = None
    self.last_window         = None
    self.included_exog       = False
    self.exog_type           = None
    self.exog_dtypes         = None
    self.exog_col_names      = None
    self.series_col_names    = None
    self.X_train_col_names   = None
    self.in_sample_residuals = None
    self.fitted              = False
    self.training_range      = None

    self.series_col_names = list(series.columns)        

    if exog is not None:
        self.included_exog = True
        self.exog_type = type(exog)
        self.exog_col_names = \
             exog.columns.to_list() if isinstance(exog, pd.DataFrame) else [exog.name]

        if len(set(self.exog_col_names) - set(self.series_col_names)) != len(self.exog_col_names):
            raise ValueError(
                (f"`exog` cannot contain a column named the same as one of the series"
                 f" (column names of series).\n"
                 f"    `series` columns : {self.series_col_names}.\n"
                 f"    `exog`   columns : {self.exog_col_names}.")
            )

    X_train, y_train, y_index, y_train_index = self.create_train_X_y(series=series, exog=exog)
    sample_weight = self.create_sample_weights(
                        series        = series,
                        X_train       = X_train,
                        y_train_index = y_train_index,
                    )

    if sample_weight is not None:
        self.regressor.fit(
            X             = X_train,
            y             = y_train,
            sample_weight = sample_weight,
            **self.fit_kwargs
        )
    else:
        self.regressor.fit(X=X_train, y=y_train, **self.fit_kwargs)

    self.fitted = True
    self.fit_date = pd.Timestamp.today().strftime('%Y-%m-%d %H:%M:%S')
    self.training_range = y_index[[0, -1]]
    self.index_type = type(y_index)
    if isinstance(y_index, pd.DatetimeIndex):
        self.index_freq = y_index.freqstr
    else: 
        self.index_freq = y_index.step
    self.index_values = y_index

    in_sample_residuals = {}

    # This is done to save time during fit in functions such as backtesting()
    if store_in_sample_residuals:

        residuals = y_train - self.regressor.predict(X_train)

        for serie in series.columns:
            in_sample_residuals[serie] = residuals.loc[X_train[serie] == 1.].to_numpy()
            if len(in_sample_residuals[serie]) > 1000:
                # Only up to 1000 residuals are stored
                rng = np.random.default_rng(seed=123)
                in_sample_residuals[serie] = rng.choice(
                                                 a       = in_sample_residuals[serie], 
                                                 size    = 1000, 
                                                 replace = False
                                             )
    else:
        for serie in series.columns:
            in_sample_residuals[serie] = None

    self.in_sample_residuals = in_sample_residuals

    # The last time window of training data is stored so that lags needed as
    # predictors in the first iteration of `predict()` can be calculated.
    self.last_window = series.iloc[-self.max_lag:, ].copy()

`get_feature_importance(self)` ¶

This method has been replaced by get_feature_importances().

Return feature importances of the regressor stored in the forecaster. Only valid when regressor stores internally the feature importances in the attribute feature_importances_ or coef_.

Parameters:

Name	Type	Description	Default
`self`	`None`		required

Returns:

Type	Description
`DataFrame`	Feature importances associated with each predictor.

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def get_feature_importance(
    self
) -> pd.DataFrame:
    """
    This method has been replaced by `get_feature_importances()`.

    Return feature importances of the regressor stored in the
    forecaster. Only valid when regressor stores internally the feature
    importances in the attribute `feature_importances_` or `coef_`.

    Parameters
    ----------
    self

    Returns
    -------
    feature_importances : pandas DataFrame
        Feature importances associated with each predictor.

    """

    warnings.warn(
        ("get_feature_importance() method has been renamed to get_feature_importances()."
         "This method will be removed in skforecast 0.9.0.")
    )

    return self.get_feature_importances()

`get_feature_importances(self)` ¶

Return feature importances of the regressor stored in the

forecaster. Only valid when regressor stores internally the feature importances in the attribute feature_importances_ or coef_.

Parameters:

Name	Type	Description	Default
`self`	`None`		required

Returns:

Type	Description
`DataFrame`	Feature importances associated with each predictor.

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def get_feature_importances(
    self
) -> pd.DataFrame:
    """
    Return feature importances of the regressor stored in the
    forecaster. Only valid when regressor stores internally the feature
    importances in the attribute `feature_importances_` or `coef_`.

    Parameters
    ----------
    self

    Returns
    -------
    feature_importances : pandas DataFrame
        Feature importances associated with each predictor.

    """

    if not self.fitted:
        raise sklearn.exceptions.NotFittedError(
            ("This forecaster is not fitted yet. Call `fit` with appropriate "
             "arguments before using `get_feature_importances()`.")
        )

    if isinstance(self.regressor, sklearn.pipeline.Pipeline):
        estimator = self.regressor[-1]
    else:
        estimator = self.regressor

    if hasattr(estimator, 'feature_importances_'):
        feature_importances = estimator.feature_importances_
    elif hasattr(estimator, 'coef_'):
        feature_importances = estimator.coef_
    else:
        warnings.warn(
            (f"Impossible to access feature importances for regressor of type "
             f"{type(estimator)}. This method is only valid when the "
             f"regressor stores internally the feature importances in the "
             f"attribute `feature_importances_` or `coef_`.")
        )
        feature_importances = None

    if feature_importances is not None:
        feature_importances = pd.DataFrame({
                                  'feature': self.X_train_col_names,
                                  'importance': feature_importances
                              })

    return feature_importances

`predict(self, steps, levels=None, last_window=None, exog=None)` ¶

Predict n steps ahead. It is an recursive process in which, each prediction,

is used as a predictor for the next step.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Number of future steps predicted.	required
`levels`	`Union[str, list]`	Time series to be predicted. If `None` all levels will be predicted. New in version 0.6.0	`None`
`last_window`	`Optional[pandas.core.frame.DataFrame]`	Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1). If `last_window = None`, the values stored in `self.last_window` are used to calculate the initial predictors, and the predictions start right after training data.	`None`
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s.	`None`

Returns:

Type	Description
`DataFrame`	Predicted values, one column for each level.

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def predict(
    self,
    steps: int,
    levels: Optional[Union[str, list]]=None,
    last_window: Optional[pd.DataFrame]=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None
) -> pd.DataFrame:
    """
    Predict n steps ahead. It is an recursive process in which, each prediction,
    is used as a predictor for the next step.

    Parameters
    ----------
    steps : int
        Number of future steps predicted.

    levels : str, list, default `None`
        Time series to be predicted. If `None` all levels will be predicted.
        **New in version 0.6.0**

    last_window : pandas DataFrame, default `None`
        Series values used to create the predictors (lags) needed in the 
        first iteration of the prediction (t + 1).

        If `last_window = None`, the values stored in `self.last_window` are
        used to calculate the initial predictors, and the predictions start
        right after training data.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s.

    Returns
    -------
    predictions : pandas DataFrame
        Predicted values, one column for each level.

    """

    if levels is None:
        levels = self.series_col_names
    elif isinstance(levels, str):
        levels = [levels]

    if last_window is None:
        last_window = deepcopy(self.last_window)

    check_predict_input(
        forecaster_name  = type(self).__name__,
        steps            = steps,
        fitted           = self.fitted,
        included_exog    = self.included_exog,
        index_type       = self.index_type,
        index_freq       = self.index_freq,
        window_size      = self.window_size,
        last_window      = last_window,
        last_window_exog = None,
        exog             = exog,
        exog_type        = self.exog_type,
        exog_col_names   = self.exog_col_names,
        interval         = None,
        alpha            = None,
        max_steps        = None,
        levels           = levels,
        series_col_names = self.series_col_names
    )

    if exog is not None:
        if isinstance(exog, pd.DataFrame):
            exog = transform_dataframe(
                       df                = exog,
                       transformer       = self.transformer_exog,
                       fit               = False,
                       inverse_transform = False
                   )
        else:
            exog = transform_series(
                       series            = exog,
                       transformer       = self.transformer_exog,
                       fit               = False,
                       inverse_transform = False
                   )
        check_exog_dtypes(exog=exog)
        exog_values = exog.iloc[:steps, ].to_numpy()
    else:
        exog_values = None

    predictions = []

    for level in levels:

        last_window_level = transform_series(
                                series            = last_window[level],
                                transformer       = self.transformer_series_[level],
                                fit               = False,
                                inverse_transform = False
                            )
        last_window_values, last_window_index = preprocess_last_window(
                                                    last_window = last_window_level
                                                )

        preds_level = self._recursive_predict(
                          steps       = steps,
                          level       = level,
                          last_window = copy(last_window_values),
                          exog        = copy(exog_values)
                      )

        preds_level = pd.Series(
                          data  = preds_level,
                          index = expand_index(
                                      index = last_window_index,
                                      steps = steps
                                  ),
                          name = level
                      )

        preds_level = transform_series(
                          series            = preds_level,
                          transformer       = self.transformer_series_[level],
                          fit               = False,
                          inverse_transform = True
                      )

        predictions.append(preds_level)    

    predictions = pd.concat(predictions, axis=1)

    return predictions

`predict_bootstrapping(self, steps, levels=None, last_window=None, exog=None, n_boot=500, random_state=123, in_sample_residuals=True)` ¶

Generate multiple forecasting predictions using a bootstrapping process.

By sampling from a collection of past observed errors (the residuals), each iteration of bootstrapping generates a different set of predictions. See the Notes section for more information.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Number of future steps predicted.	required
`levels`	`Union[str, list]`	Time series to be predicted. If `None` all levels will be predicted.	`None`
`last_window`	`Optional[pandas.core.frame.DataFrame]`	Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1). If `last_window = None`, the values stored in `self.last_window` are used to calculate the initial predictors, and the predictions start right after training data.	`None`
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s.	`None`
`n_boot`	`int`	Number of bootstrapping iterations used to estimate prediction intervals.	`500`
`random_state`	`int`	Sets a seed to the random generator, so that boot intervals are always deterministic.	`123`
`in_sample_residuals`	`bool`	If `True`, residuals from the training data are used as proxy of prediction error to create prediction intervals. If `False`, out of sample residuals are used. In the latter case, the user should have calculated and stored the residuals within the forecaster (see `set_out_sample_residuals()`).	`True`

Returns:

Type	Description
`dict`	Predictions generated by bootstrapping for each level. {level: pandas DataFrame, shape (steps, n_boot)}

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def predict_bootstrapping(
    self,
    steps: int,
    levels: Optional[Union[str, list]]=None,
    last_window: Optional[pd.DataFrame]=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    n_boot: int=500,
    random_state: int=123,
    in_sample_residuals: bool=True
) -> dict:
    """
    Generate multiple forecasting predictions using a bootstrapping process. 
    By sampling from a collection of past observed errors (the residuals),
    each iteration of bootstrapping generates a different set of predictions. 
    See the Notes section for more information. 

    Parameters
    ----------   
    steps : int
        Number of future steps predicted.

    levels : str, list, default `None`
        Time series to be predicted. If `None` all levels will be predicted.

    last_window : pandas DataFrame, default `None`
        Series values used to create the predictors (lags) needed in the 
        first iteration of the prediction (t + 1).

        If `last_window = None`, the values stored in `self.last_window` are
        used to calculate the initial predictors, and the predictions start
        right after training data.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s.

    n_boot : int, default `500`
        Number of bootstrapping iterations used to estimate prediction
        intervals.

    random_state : int, default `123`
        Sets a seed to the random generator, so that boot intervals are always 
        deterministic.

    in_sample_residuals : bool, default `True`
        If `True`, residuals from the training data are used as proxy of
        prediction error to create prediction intervals. If `False`, out of
        sample residuals are used. In the latter case, the user should have
        calculated and stored the residuals within the forecaster (see
        `set_out_sample_residuals()`).

    Returns 
    -------
    boot_predictions : dict
        Predictions generated by bootstrapping for each level. 
        {level: pandas DataFrame, shape (steps, n_boot)}

    Notes
    -----
    More information about prediction intervals in forecasting:
    https://otexts.com/fpp3/prediction-intervals.html#prediction-intervals-from-bootstrapped-residuals
    Forecasting: Principles and Practice (3nd ed) Rob J Hyndman and George Athanasopoulos.

    """

    if levels is None:
        levels = self.series_col_names
    elif isinstance(levels, str):
        levels = [levels]

    if in_sample_residuals:
        if not set(levels).issubset(set(self.in_sample_residuals.keys())):
            raise ValueError(
                (f"Not `forecaster.in_sample_residuals` for levels: "
                 f"{set(levels) - set(self.in_sample_residuals.keys())}.")
            )
        residuals_levels = self.in_sample_residuals
    else:
        if self.out_sample_residuals is None:
            raise ValueError(
                ("`forecaster.out_sample_residuals` is `None`. Use "
                 "`in_sample_residuals=True` or method `set_out_sample_residuals()` "
                 "before `predict_interval()`, `predict_bootstrapping()` or "
                 "`predict_dist()`.")
            )
        else:
            if not set(levels).issubset(set(self.out_sample_residuals.keys())):
                raise ValueError(
                    (f"Not `forecaster.out_sample_residuals` for levels: "
                     f"{set(levels) - set(self.out_sample_residuals.keys())}. "
                     f"Use method `set_out_sample_residuals()`.")
                )
        residuals_levels = self.out_sample_residuals

    check_residuals = 'forecaster.in_sample_residuals' if in_sample_residuals else 'forecaster.out_sample_residuals'
    for level in levels:
        if residuals_levels[level] is None:
            raise ValueError(
                (f"forecaster residuals for level '{level}' are `None`. Check `{check_residuals}`.")
            )
        elif (residuals_levels[level] == None).any():
            raise ValueError(
                (f"forecaster residuals for level '{level}' contains `None` values. Check `{check_residuals}`.")
            )

    if last_window is None:
        last_window = deepcopy(self.last_window)

    check_predict_input(
        forecaster_name  = type(self).__name__,
        steps            = steps,
        fitted           = self.fitted,
        included_exog    = self.included_exog,
        index_type       = self.index_type,
        index_freq       = self.index_freq,
        window_size      = self.window_size,
        last_window      = last_window,
        last_window_exog = None,
        exog             = exog,
        exog_type        = self.exog_type,
        exog_col_names   = self.exog_col_names,
        interval         = None,
        alpha            = None,
        max_steps        = None,
        levels           = levels,
        series_col_names = self.series_col_names
    )

    if exog is not None:
        if isinstance(exog, pd.DataFrame):
            exog = transform_dataframe(
                       df                = exog,
                       transformer       = self.transformer_exog,
                       fit               = False,
                       inverse_transform = False
                   )
        else:
            exog = transform_series(
                       series            = exog,
                       transformer       = self.transformer_exog,
                       fit               = False,
                       inverse_transform = False
                   )
        exog_values = exog.iloc[:steps, ].to_numpy()
    else:
        exog_values = None

    boot_predictions = {}

    for level in levels:

        last_window_level = transform_series(
                                series            = last_window[level],
                                transformer       = self.transformer_series_[level],
                                fit               = False,
                                inverse_transform = False
                            )
        last_window_values, last_window_index = preprocess_last_window(
                                                    last_window = last_window_level
                                                )

        level_boot_predictions = np.full(
                                     shape      = (steps, n_boot),
                                     fill_value = np.nan,
                                     dtype      = float
                                 )
        rng = np.random.default_rng(seed=random_state)
        seeds = rng.integers(low=0, high=10000, size=n_boot)

        residuals = residuals_levels[level]

        for i in range(n_boot):
            # In each bootstraping iteration the initial last_window and exog 
            # need to be restored.
            last_window_boot = last_window_values.copy()
            exog_boot = exog_values.copy() if exog is not None else None

            rng = np.random.default_rng(seed=seeds[i])
            sample_residuals = rng.choice(
                                   a       = residuals,
                                   size    = steps,
                                   replace = True
                               )

            for step in range(steps):

                prediction = self._recursive_predict(
                                 steps       = 1,
                                 level       = level,
                                 last_window = last_window_boot,
                                 exog        = exog_boot 
                             )

                prediction_with_residual = prediction + sample_residuals[step]
                level_boot_predictions[step, i] = prediction_with_residual

                last_window_boot = np.append(
                                       last_window_boot[1:],
                                       prediction_with_residual
                                   )
                if exog is not None:
                    exog_boot = exog_boot[1:]

        level_boot_predictions = pd.DataFrame(
                                     data    = level_boot_predictions,
                                     index   = expand_index(last_window_index, steps=steps),
                                     columns = [f"pred_boot_{i}" for i in range(n_boot)]
                                 )

        if self.transformer_series_[level]:
            for col in level_boot_predictions.columns:
                level_boot_predictions[col] = transform_series(
                                                  series            = level_boot_predictions[col],
                                                  transformer       = self.transformer_series_[level],
                                                  fit               = False,
                                                  inverse_transform = True
                                              )

        boot_predictions[level] = level_boot_predictions

    return boot_predictions

`predict_dist(self, steps, distribution, levels=None, last_window=None, exog=None, n_boot=500, random_state=123, in_sample_residuals=True)` ¶

Fit a given probability distribution for each step. After generating

multiple forecasting predictions through a bootstrapping process, each step is fitted to the given distribution.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Number of future steps predicted.	required
`distribution`	`object`	A distribution object from scipy.stats.	required
`levels`	`Union[str, list]`	Time series to be predicted. If `None` all levels will be predicted. New in version 0.6.0	`None`
`last_window`	`Optional[pandas.core.frame.DataFrame]`	Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1). If `last_window = None`, the values stored in`self.last_window` are used to calculate the initial predictors, and the predictions start right after training data.	`None`
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s.	`None`
`n_boot`	`int`	Number of bootstrapping iterations used to estimate prediction intervals.	`500`
`random_state`	`int`	Sets a seed to the random generator, so that boot intervals are always deterministic.	`123`
`in_sample_residuals`	`bool`	If `True`, residuals from the training data are used as proxy of prediction error to create prediction intervals. If `False`, out of sample residuals are used. In the latter case, the user should have calculated and stored the residuals within the forecaster (see `set_out_sample_residuals()`).	`True`

Returns:

Type	Description
`DataFrame`	Distribution parameters estimated for each step and level.

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def predict_dist(
    self,
    steps: int,
    distribution: object,
    levels: Optional[Union[str, list]]=None,
    last_window: Optional[pd.DataFrame]=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    n_boot: int=500,
    random_state: int=123,
    in_sample_residuals: bool=True
) -> pd.DataFrame:
    """
    Fit a given probability distribution for each step. After generating 
    multiple forecasting predictions through a bootstrapping process, each 
    step is fitted to the given distribution.

    Parameters
    ---------- 
    steps : int
        Number of future steps predicted.

    distribution : Object
        A distribution object from scipy.stats.

    levels : str, list, default `None`
        Time series to be predicted. If `None` all levels will be predicted.  
        **New in version 0.6.0**  

    last_window : pandas DataFrame, default `None`
        Series values used to create the predictors (lags) needed in the 
        first iteration of the prediction (t + 1).

        If `last_window = None`, the values stored in` self.last_window` are
        used to calculate the initial predictors, and the predictions start
        right after training data.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s.

    n_boot : int, default `500`
        Number of bootstrapping iterations used to estimate prediction
        intervals.

    random_state : int, default `123`
        Sets a seed to the random generator, so that boot intervals are always 
        deterministic.

    in_sample_residuals : bool, default `True`
        If `True`, residuals from the training data are used as proxy of
        prediction error to create prediction intervals. If `False`, out of
        sample residuals are used. In the latter case, the user should have
        calculated and stored the residuals within the forecaster (see
        `set_out_sample_residuals()`).

    Returns 
    -------
    predictions : pandas DataFrame
        Distribution parameters estimated for each step and level.

    """

    if levels is None:
        levels = self.series_col_names
    elif isinstance(levels, str):
        levels = [levels]

    boot_samples = self.predict_bootstrapping(
                       steps               = steps,
                       levels              = levels,
                       last_window         = last_window,
                       exog                = exog,
                       n_boot              = n_boot,
                       random_state        = random_state,
                       in_sample_residuals = in_sample_residuals
                   )

    param_names = [p for p in inspect.signature(distribution._pdf).parameters if not p=='x'] + ["loc","scale"]
    predictions = []

    for level in levels:
        param_values = np.apply_along_axis(lambda x: distribution.fit(x), axis=1, arr=boot_samples[level])
        level_param_names = [f'{level}_{p}' for p in param_names]

        pred_level = pd.DataFrame(
                         data    = param_values,
                         columns = level_param_names,
                         index   = boot_samples[level].index
                     )

        predictions.append(pred_level)

    predictions = pd.concat(predictions, axis=1)

    return predictions

`predict_interval(self, steps, levels=None, last_window=None, exog=None, interval=[5, 95], n_boot=500, random_state=123, in_sample_residuals=True)` ¶

Iterative process in which, each prediction, is used as a predictor

for the next step and bootstrapping is used to estimate prediction intervals. Both predictions and intervals are returned.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Number of future steps predicted.	required
`levels`	`Union[str, list]`	Time series to be predicted. If `None` all levels will be predicted. New in version 0.6.0	`None`
`last_window`	`Optional[pandas.core.frame.DataFrame]`	Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1). If `last_window = None`, the values stored in`self.last_window` are used to calculate the initial predictors, and the predictions start right after training data.	`None`
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s.	`None`
`interval`	`list`	Confidence of the prediction interval estimated. Sequence of percentiles to compute, which must be between 0 and 100 inclusive. For example, interval of 95% should be as `interval = [2.5, 97.5]`.	`[5, 95]`
`n_boot`	`int`	Number of bootstrapping iterations used to estimate prediction intervals.	`500`
`random_state`	`int`	Sets a seed to the random generator, so that boot intervals are always deterministic.	`123`
`in_sample_residuals`	`bool`	If `True`, residuals from the training data are used as proxy of prediction error to create prediction intervals. If `False`, out of sample residuals are used. In the latter case, the user should have calculated and stored the residuals within the forecaster (see `set_out_sample_residuals()`).	`True`

Returns:

Type	Description
`DataFrame`	Values predicted by the forecaster and their estimated interval. level: predictions. level_lower_bound: lower bound of the interval. level_upper_bound: upper bound interval of the interval.

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def predict_interval(
    self,
    steps: int,
    levels: Optional[Union[str, list]]=None,
    last_window: Optional[pd.DataFrame]=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    interval: list=[5, 95],
    n_boot: int=500,
    random_state: int=123,
    in_sample_residuals: bool=True
) -> pd.DataFrame:
    """
    Iterative process in which, each prediction, is used as a predictor
    for the next step and bootstrapping is used to estimate prediction
    intervals. Both predictions and intervals are returned.

    Parameters
    ---------- 
    steps : int
        Number of future steps predicted.

    levels : str, list, default `None`
        Time series to be predicted. If `None` all levels will be predicted.  
        **New in version 0.6.0**  

    last_window : pandas DataFrame, default `None`
        Series values used to create the predictors (lags) needed in the 
        first iteration of the prediction (t + 1).

        If `last_window = None`, the values stored in` self.last_window` are
        used to calculate the initial predictors, and the predictions start
        right after training data.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s.

    interval : list, default `[5, 95]`
        Confidence of the prediction interval estimated. Sequence of 
        percentiles to compute, which must be between 0 and 100 inclusive. 
        For example, interval of 95% should be as `interval = [2.5, 97.5]`.

    n_boot : int, default `500`
        Number of bootstrapping iterations used to estimate prediction
        intervals.

    random_state : int, default `123`
        Sets a seed to the random generator, so that boot intervals are always 
        deterministic.

    in_sample_residuals : bool, default `True`
        If `True`, residuals from the training data are used as proxy of
        prediction error to create prediction intervals. If `False`, out of
        sample residuals are used. In the latter case, the user should have
        calculated and stored the residuals within the forecaster (see
        `set_out_sample_residuals()`).

    Returns 
    -------
    predictions : pandas DataFrame
        Values predicted by the forecaster and their estimated interval.
            level: predictions.
            level_lower_bound: lower bound of the interval.
            level_upper_bound: upper bound interval of the interval.

    Notes
    -----
    More information about prediction intervals in forecasting:
    https://otexts.com/fpp2/prediction-intervals.html
    Forecasting: Principles and Practice (2nd ed) Rob J Hyndman and
    George Athanasopoulos.

    """

    if levels is None:
        levels = self.series_col_names
    elif isinstance(levels, str):
        levels = [levels]

    check_interval(interval=interval)

    preds = self.predict(
                steps       = steps,
                levels      = levels,
                last_window = last_window,
                exog        = exog
            )

    boot_predictions = self.predict_bootstrapping(
                           steps               = steps,
                           levels              = levels,
                           last_window         = last_window,
                           exog                = exog,
                           n_boot              = n_boot,
                           random_state        = random_state,
                           in_sample_residuals = in_sample_residuals
                       )

    interval = np.array(interval)/100
    predictions = []

    for level in levels:
        preds_interval = boot_predictions[level].quantile(q=interval, axis=1).transpose()
        preds_interval.columns = [f'{level}_lower_bound', f'{level}_upper_bound']
        predictions.append(preds[level])
        predictions.append(preds_interval)

    predictions = pd.concat(predictions, axis=1)

    return predictions

`set_fit_kwargs(self, fit_kwargs)` ¶

Set new values for the additional keyword arguments passed to the fit

method of the regressor.

Parameters:

Name	Type	Description	Default
`fit_kwargs`	`dict`	Dict of the form {"argument": new_value}.	required

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def set_fit_kwargs(
    self, 
    fit_kwargs: dict
) -> None:
    """
    Set new values for the additional keyword arguments passed to the `fit` 
    method of the regressor.

    Parameters
    ----------
    fit_kwargs : dict
        Dict of the form {"argument": new_value}.

    Returns 
    -------
    None

    """

    self.fit_kwargs = check_select_fit_kwargs(self.regressor, fit_kwargs=fit_kwargs)

`set_lags(self, lags)` ¶

Set new value to the attribute lags.

Attributes max_lag and window_size are also updated.

Parameters:

Name	Type	Description	Default
`lags`	`Union[int, list, numpy.ndarray, range]`	Lags used as predictors. Index starts at 1, so lag 1 is equal to t-1. `int`: include lags from 1 to `lags`. `list` or `np.array`: include only lags present in `lags`.	required

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def set_lags(
    self, 
    lags: Union[int, list, np.ndarray, range]
) -> None:
    """      
    Set new value to the attribute `lags`.
    Attributes `max_lag` and `window_size` are also updated.

    Parameters
    ----------
    lags : int, list, 1D np.array, range
        Lags used as predictors. Index starts at 1, so lag 1 is equal to t-1.
            `int`: include lags from 1 to `lags`.
            `list` or `np.array`: include only lags present in `lags`.

    Returns 
    -------
    None

    """

    self.lags = initialize_lags(type(self).__name__, lags)            
    self.max_lag  = max(self.lags)
    self.window_size = max(self.lags)

`set_out_sample_residuals(self, residuals, append=True, transform=True, random_state=123)` ¶

Set new values to the attribute out_sample_residuals. Out of sample

residuals are meant to be calculated using observations that did not participate in the training process.

Parameters:

Name	Type	Description	Default
`residuals`	`dict`	Dictionary of numpy ndarrays with the residuals of each level in the form {level: residuals}. If len(residuals) > 1000, only a random sample of 1000 values are stored. Keys must be the same as `levels`.	required
`append`	`bool`	If `True`, new residuals are added to the once already stored in the attribute `out_sample_residuals`. Once the limit of 1000 values is reached, no more values are appended. If False, `out_sample_residuals` is overwritten with the new residuals.	`True`
`transform`	`bool`	If `True`, new residuals are transformed using self.transformer_series.	`True`
`random_state`	`int`	Sets a seed to the random sampling for reproducible output.	`123`

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def set_out_sample_residuals(
    self, 
    residuals: dict,
    append: bool=True,
    transform: bool=True,
    random_state: int=123
)-> None:
    """
    Set new values to the attribute `out_sample_residuals`. Out of sample
    residuals are meant to be calculated using observations that did not
    participate in the training process.

    Parameters
    ----------
    residuals : dict
        Dictionary of numpy ndarrays with the residuals of each level in the
        form {level: residuals}. If len(residuals) > 1000, only a random 
        sample of 1000 values are stored. Keys must be the same as `levels`.

    append : bool, default `True`
        If `True`, new residuals are added to the once already stored in the
        attribute `out_sample_residuals`. Once the limit of 1000 values is
        reached, no more values are appended. If False, `out_sample_residuals`
        is overwritten with the new residuals.

    transform : bool, default `True`
        If `True`, new residuals are transformed using self.transformer_series.

    random_state : int, default `123`
        Sets a seed to the random sampling for reproducible output.

    Returns 
    -------
    self

    """

    if not isinstance(residuals, dict) or not all(isinstance(x, np.ndarray) for x in residuals.values()):
        raise TypeError(
            f"`residuals` argument must be a dict of numpy ndarrays in the form "
            "`{level: residuals}`. " 
            f"Got {type(residuals)}."
        )

    if not self.fitted:
        raise sklearn.exceptions.NotFittedError(
            ("This forecaster is not fitted yet. Call `fit` with appropriate "
             "arguments before using `set_out_sample_residuals()`.")
        )

    if self.out_sample_residuals is None:
        self.out_sample_residuals = {level: None for level in self.series_col_names}

    if not set(self.out_sample_residuals.keys()).issubset(set(residuals.keys())):
        warnings.warn(
            (f"""
            Only residuals of levels 
            {set(self.out_sample_residuals.keys()).intersection(set(residuals.keys()))} 
            are updated.
            """), IgnoredArgumentWarning
        )

    residuals = {key: value for key, value in residuals.items() if key in self.out_sample_residuals.keys()}

    for level, value in residuals.items():

        residuals_level = value

        if not transform and self.transformer_series_[level] is not None:
            warnings.warn(
                ("Argument `transform` is set to `False` but forecaster was "
                f"trained using a transformer {self.transformer_series_[level]} "
                f"for level {level}. Ensure that the new residuals are "
                 "already transformed or set `transform=True`.")
            )

        if transform and self.transformer_series_ and self.transformer_series_[level]:
            warnings.warn(
                ("Residuals will be transformed using the same transformer used "
                f"when training the forecaster for level {level} : "
                f"({self.transformer_series_[level]}). Ensure that the new "
                 "residuals are on the same scale as the original time series.")
            )
            residuals_level = transform_series(
                                  series            = pd.Series(residuals_level, name='residuals'),
                                  transformer       = self.transformer_series_[level],
                                  fit               = False,
                                  inverse_transform = False
                              ).to_numpy()

        if len(residuals_level) > 1000:
            rng = np.random.default_rng(seed=random_state)
            residuals_level = rng.choice(a=residuals_level, size=1000, replace=False)

        if append and self.out_sample_residuals[level] is not None:
            free_space = max(0, 1000 - len(self.out_sample_residuals[level]))
            if len(residuals_level) < free_space:
                residuals_level = np.hstack((
                                        self.out_sample_residuals[level],
                                        residuals_level
                                    ))
            else:
                residuals_level = np.hstack((
                                        self.out_sample_residuals[level],
                                        residuals_level[:free_space]
                                    ))

        self.out_sample_residuals[level] = residuals_level

`set_params(self, params)` ¶

Set new values to the parameters of the scikit learn model stored in the

forecaster.

Parameters:

Name	Type	Description	Default
`params`	`dict`	Parameters values.	required

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def set_params(
    self, 
    params: dict
) -> None:
    """
    Set new values to the parameters of the scikit learn model stored in the
    forecaster.

    Parameters
    ----------
    params : dict
        Parameters values.

    Returns 
    -------
    self

    """

    self.regressor = clone(self.regressor)
    self.regressor.set_params(**params)

`_create_lags(self, y)` `private` ¶

Transforms a 1d array into a 2d array (X) and a 1d array (y). Each row

in X is associated with a value of y and it represents the lags that precede it.

Notice that, the returned matrix X_data, contains the lag 1 in the first column, the lag 2 in the second column and so on.

Parameters:

Name	Type	Description	Default
`y`	`ndarray`	Training time series.	required

Returns:

Type	Description
`Tuple[numpy.ndarray, numpy.ndarray]`	2d numpy array with the lagged values (predictors).

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def _create_lags(
    self, 
    y: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    """       
    Transforms a 1d array into a 2d array (X) and a 1d array (y). Each row
    in X is associated with a value of y and it represents the lags that
    precede it.

    Notice that, the returned matrix X_data, contains the lag 1 in the first
    column, the lag 2 in the second column and so on.

    Parameters
    ----------        
    y : 1d numpy ndarray
        Training time series.

    Returns 
    -------
    X_data : 2d numpy ndarray, shape (samples - max(self.lags), len(self.lags))
        2d numpy array with the lagged values (predictors).

    y_data : 1d numpy ndarray, shape (samples - max(self.lags),)
        Values of the time series related to each row of `X_data`.

    """

    n_splits = len(y) - self.max_lag
    if n_splits <= 0:
        raise ValueError(
            f"The maximum lag ({self.max_lag}) must be less than the length "
            f"of the series ({len(y)})."
        )

    X_data = np.full(shape=(n_splits, len(self.lags)), fill_value=np.nan, dtype=float)

    for i, lag in enumerate(self.lags):
        X_data[:, i] = y[self.max_lag - lag: -lag]

    y_data = y[self.max_lag:]

    return X_data, y_data

`_recursive_predict(self, steps, level, last_window, exog=None)` `private` ¶

Predict n steps ahead. It is an iterative process in which, each prediction,

is used as a predictor for the next step.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Number of future steps predicted.	required
`level`	`str`	Time series to be predicted.	required
`last_window`	`ndarray`	Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1).	required
`exog`	`Optional[numpy.ndarray]`	Exogenous variable/s included as predictor/s.	`None`

Returns:

Type	Description
`ndarray`	Predicted values.

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def _recursive_predict(
    self,
    steps: int,
    level: str,
    last_window: np.ndarray,
    exog: Optional[np.ndarray]=None
) -> np.ndarray:
    """
    Predict n steps ahead. It is an iterative process in which, each prediction,
    is used as a predictor for the next step.

    Parameters
    ----------
    steps : int
        Number of future steps predicted.

    level : str
        Time series to be predicted.

    last_window : numpy ndarray
        Series values used to create the predictors (lags) needed in the 
        first iteration of the prediction (t + 1).

    exog : numpy ndarray, default `None`
        Exogenous variable/s included as predictor/s.

    Returns 
    -------
    predictions : numpy ndarray
        Predicted values.

    """

    predictions = np.full(shape=steps, fill_value=np.nan)

    for i in range(steps):
        X = last_window[-self.lags].reshape(1, -1)
        if exog is not None:
            X = np.column_stack((X, exog[i, ].reshape(1, -1)))

        levels_dummies = np.zeros(shape=(1, len(self.series_col_names)), dtype=float)
        levels_dummies[0][self.series_col_names.index(level)] = 1.

        X = np.column_stack((X, levels_dummies.reshape(1, -1)))

        with warnings.catch_warnings():
            # Suppress scikit-learn warning: "X does not have valid feature names,
            # but NoOpTransformer was fitted with feature names".
            warnings.simplefilter("ignore")
            prediction = self.regressor.predict(X)
            predictions[i] = prediction.ravel()[0]

        # Update `last_window` values. The first position is discarded and 
        # the new prediction is added at the end.
        last_window = np.append(last_window[1:], prediction)

    return predictions

`repr(self)` `special` ¶

Information displayed when a ForecasterAutoregMultiSeries object is printed.

Source code in skforecast/ForecasterAutoregMultiSeries/ForecasterAutoregMultiSeries.py

def __repr__(
    self
) -> str:
    """
    Information displayed when a ForecasterAutoregMultiSeries object is printed.
    """

    if isinstance(self.regressor, sklearn.pipeline.Pipeline):
        name_pipe_steps = tuple(name + "__" for name in self.regressor.named_steps.keys())
        params = {key : value for key, value in self.regressor.get_params().items() \
                  if key.startswith(name_pipe_steps)}
    else:
        params = self.regressor.get_params()

    info = (
        f"{'=' * len(type(self).__name__)} \n"
        f"{type(self).__name__} \n"
        f"{'=' * len(type(self).__name__)} \n"
        f"Regressor: {self.regressor} \n"
        f"Lags: {self.lags} \n"
        f"Transformer for series: {self.transformer_series} \n"
        f"Transformer for exog: {self.transformer_exog} \n"
        f"Window size: {self.window_size} \n"
        f"Series levels (names): {self.series_col_names} \n"
        f"Series weights: {self.series_weights} \n"
        f"Weight function included: {True if self.weight_func is not None else False} \n"
        f"Exogenous included: {self.included_exog} \n"
        f"Type of exogenous variable: {self.exog_type} \n"
        f"Exogenous variables names: {self.exog_col_names} \n"
        f"Training range: {self.training_range.to_list() if self.fitted else None} \n"
        f"Training index type: {str(self.index_type).split('.')[-1][:-2] if self.fitted else None} \n"
        f"Training index frequency: {self.index_freq if self.fitted else None} \n"
        f"Regressor parameters: {params} \n"
        f"fit_kwargs: {self.fit_kwargs} \n"
        f"Creation date: {self.creation_date} \n"
        f"Last fit date: {self.fit_date} \n"
        f"Skforecast version: {self.skforcast_version} \n"
        f"Python version: {self.python_version} \n"
        f"Forecaster id: {self.forecaster_id} \n"
    )

    return info

ForecasterAutoregMultiSeries¶