`ForecasterAutoreg`¶

`ForecasterAutoreg (ForecasterBase)` ¶

This class turns any regressor compatible with the scikit-learn API into a

recursive autoregressive (multi-step) forecaster.

Parameters:

Name	Type	Description	Default
`regressor`	`object`	An instance of a regressor or pipeline compatible with the scikit-learn API.	required
`lags`	`Union[int, numpy.ndarray, list]`	Lags used as predictors. Index starts at 1, so lag 1 is equal to t-1. `int`: include lags from 1 to `lags` (included). `list`, `numpy ndarray` or `range`: include only lags present in `lags`, all elements must be int.	required
`transformer_y`	`Optional[object]`	An instance of a transformer (preprocessor) compatible with the scikit-learn preprocessing API with methods: fit, transform, fit_transform and inverse_transform. ColumnTransformers are not allowed since they do not have inverse_transform method. The transformation is applied to `y` before training the forecaster.	`None`
`transformer_exog`	`Optional[object]`	An instance of a transformer (preprocessor) compatible with the scikit-learn preprocessing API. The transformation is applied to `exog` before training the forecaster. `inverse_transform` is not available when using ColumnTransformers.	`None`
`weight_func`	`Optional[Callable]`	Function that defines the individual weights for each sample based on the index. For example, a function that assigns a lower weight to certain dates. Ignored if `regressor` does not have the argument `sample_weight` in its `fit` method. The resulting `sample_weight` cannot have negative values.	`None`
`fit_kwargs`	`Optional[dict]`	Additional arguments to be passed to the `fit` method of the regressor. New in version 0.8.0	`None`
`forecaster_id`	`Union[str, int]`	Name used as an identifier of the forecaster. New in version 0.7.0	`None`

Attributes:

Name	Type	Description
`regressor`	`regressor or pipeline compatible with the scikit-learn API`	An instance of a regressor or pipeline compatible with the scikit-learn API.
`lags`	`numpy ndarray`	Lags used as predictors.
`transformer_y`	object transformer (preprocessor), default `None`	An instance of a transformer (preprocessor) compatible with the scikit-learn preprocessing API with methods: fit, transform, fit_transform and inverse_transform. ColumnTransformers are not allowed since they do not have inverse_transform method. The transformation is applied to `y` before training the forecaster.
`transformer_exog`	object transformer (preprocessor), default `None`	An instance of a transformer (preprocessor) compatible with the scikit-learn preprocessing API. The transformation is applied to `exog` before training the forecaster. `inverse_transform` is not available when using ColumnTransformers.
`weight_func`	`Callable`	Function that defines the individual weights for each sample based on the index. For example, a function that assigns a lower weight to certain dates. Ignored if `regressor` does not have the argument `sample_weight` in its `fit` method. New in version 0.6.0
`source_code_weight_func`	`str`	Source code of the custom function used to create weights. New in version 0.6.0
`max_lag`	`int`	Maximum value of lag included in `lags`.
`window_size`	`int`	Size of the window needed to create the predictors. It is equal to `max_lag`.
`last_window`	`pandas Series`	Last window the forecaster has seen during training. It stores the values needed to predict the next `step` immediately after the training data.
`index_type`	`type`	Type of index of the input used in training.
`index_freq`	`str`	Frequency of Index of the input used in training.
`training_range`	`pandas Index`	First and last values of index of the data used during training.
`included_exog`	`bool`	If the forecaster has been trained using exogenous variable/s.
`exog_type`	`type`	Type of exogenous data (pandas Series or DataFrame) used in training.
`exog_dtypes`	`dict`	Type of each exogenous variable/s used in training. If `transformer_exog` is used, the dtypes are calculated after the transformation.
`exog_col_names`	`list`	Names of columns of `exog` if `exog` used in training was a pandas DataFrame.
`X_train_col_names`	`list`	Names of columns of the matrix created internally for training.
`fit_kwargs`	`dict`	Additional arguments to be passed to the `fit` method of the regressor. New in version 0.8.0
`in_sample_residuals`	`numpy ndarray`	Residuals of the model when predicting training data. Only stored up to 1000 values. If `transformer_y` is not `None`, residuals are stored in the transformed scale.
`out_sample_residuals`	`numpy ndarray`	Residuals of the model when predicting non training data. Only stored up to 1000 values. If `transformer_y` is not `None`, residuals are assumed to be in the transformed scale. Use `set_out_sample_residuals` method to set values.
`fitted`	`bool`	Tag to identify if the regressor has been fitted (trained).
`creation_date`	`str`	Date of creation.
`fit_date`	`str`	Date of last fit.
`skforcast_version`	`str`	Version of skforecast library used to create the forecaster.
`python_version`	`str`	Version of python used to create the forecaster.
`forecaster_id`	str, int default `None`	Name used as an identifier of the forecaster. New in version 0.7.0

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

class ForecasterAutoreg(ForecasterBase):
    """
    This class turns any regressor compatible with the scikit-learn API into a
    recursive autoregressive (multi-step) forecaster.

    Parameters
    ----------
    regressor : regressor or pipeline compatible with the scikit-learn API
        An instance of a regressor or pipeline compatible with the scikit-learn API.

    lags : int, list, 1d numpy ndarray, range
        Lags used as predictors. Index starts at 1, so lag 1 is equal to t-1.
            `int`: include lags from 1 to `lags` (included).
            `list`, `numpy ndarray` or `range`: include only lags present in `lags`,
            all elements must be int.

    transformer_y : object transformer (preprocessor), default `None`
        An instance of a transformer (preprocessor) compatible with the scikit-learn
        preprocessing API with methods: fit, transform, fit_transform and inverse_transform.
        ColumnTransformers are not allowed since they do not have inverse_transform method.
        The transformation is applied to `y` before training the forecaster. 

    transformer_exog : object transformer (preprocessor), default `None`
        An instance of a transformer (preprocessor) compatible with the scikit-learn
        preprocessing API. The transformation is applied to `exog` before training the
        forecaster. `inverse_transform` is not available when using ColumnTransformers.

    weight_func : Callable, default `None`
        Function that defines the individual weights for each sample based on the
        index. For example, a function that assigns a lower weight to certain dates.
        Ignored if `regressor` does not have the argument `sample_weight` in its `fit`
        method. The resulting `sample_weight` cannot have negative values.

    fit_kwargs : dict, default `None`
        Additional arguments to be passed to the `fit` method of the regressor.
        **New in version 0.8.0**

    forecaster_id : str, int, default `None`
        Name used as an identifier of the forecaster.
        **New in version 0.7.0**

    Attributes
    ----------
    regressor : regressor or pipeline compatible with the scikit-learn API
        An instance of a regressor or pipeline compatible with the scikit-learn API.

    lags : numpy ndarray
        Lags used as predictors.

    transformer_y : object transformer (preprocessor), default `None`
        An instance of a transformer (preprocessor) compatible with the scikit-learn
        preprocessing API with methods: fit, transform, fit_transform and inverse_transform.
        ColumnTransformers are not allowed since they do not have inverse_transform method.
        The transformation is applied to `y` before training the forecaster.

    transformer_exog : object transformer (preprocessor), default `None`
        An instance of a transformer (preprocessor) compatible with the scikit-learn
        preprocessing API. The transformation is applied to `exog` before training the
        forecaster. `inverse_transform` is not available when using ColumnTransformers.

    weight_func : Callable
        Function that defines the individual weights for each sample based on the
        index. For example, a function that assigns a lower weight to certain dates.
        Ignored if `regressor` does not have the argument `sample_weight` in its `fit`
        method.
        **New in version 0.6.0**

    source_code_weight_func : str
        Source code of the custom function used to create weights.
        **New in version 0.6.0**

    max_lag : int
        Maximum value of lag included in `lags`.

    window_size : int
        Size of the window needed to create the predictors. It is equal to
        `max_lag`.

    last_window : pandas Series
        Last window the forecaster has seen during training. It stores the
        values needed to predict the next `step` immediately after the training data.

    index_type : type
        Type of index of the input used in training.

    index_freq : str
        Frequency of Index of the input used in training.

    training_range : pandas Index
        First and last values of index of the data used during training.

    included_exog : bool
        If the forecaster has been trained using exogenous variable/s.

    exog_type : type
        Type of exogenous data (pandas Series or DataFrame) used in training.

    exog_dtypes : dict
        Type of each exogenous variable/s used in training. If `transformer_exog` 
        is used, the dtypes are calculated after the transformation.

    exog_col_names : list
        Names of columns of `exog` if `exog` used in training was a pandas
        DataFrame.

    X_train_col_names : list
        Names of columns of the matrix created internally for training.

    fit_kwargs : dict
        Additional arguments to be passed to the `fit` method of the regressor.
        **New in version 0.8.0**

    in_sample_residuals : numpy ndarray
        Residuals of the model when predicting training data. Only stored up to
        1000 values. If `transformer_y` is not `None`, residuals are stored in the
        transformed scale.

    out_sample_residuals : numpy ndarray
        Residuals of the model when predicting non training data. Only stored
        up to 1000 values. If `transformer_y` is not `None`, residuals
        are assumed to be in the transformed scale. Use `set_out_sample_residuals` 
        method to set values.

    fitted : bool
        Tag to identify if the regressor has been fitted (trained).

    creation_date : str
        Date of creation.

    fit_date : str
        Date of last fit.

    skforcast_version : str
        Version of skforecast library used to create the forecaster.

    python_version : str
        Version of python used to create the forecaster.

    forecaster_id : str, int default `None`
        Name used as an identifier of the forecaster.
        **New in version 0.7.0**

    """

    def __init__(
        self,
        regressor: object,
        lags: Union[int, np.ndarray, list],
        transformer_y: Optional[object]=None,
        transformer_exog: Optional[object]=None,
        weight_func: Optional[Callable]=None,
        fit_kwargs: Optional[dict]=None,
        forecaster_id: Optional[Union[str, int]]=None
    ) -> None:

        self.regressor               = regressor
        self.transformer_y           = transformer_y
        self.transformer_exog        = transformer_exog
        self.weight_func             = weight_func
        self.source_code_weight_func = None
        self.last_window             = None
        self.index_type              = None
        self.index_freq              = None
        self.training_range          = None
        self.included_exog           = False
        self.exog_type               = None
        self.exog_dtypes             = None
        self.exog_col_names          = None
        self.X_train_col_names       = None
        self.in_sample_residuals     = None
        self.out_sample_residuals    = None
        self.fitted                  = False
        self.creation_date           = pd.Timestamp.today().strftime('%Y-%m-%d %H:%M:%S')
        self.fit_date                = None
        self.skforcast_version       = skforecast.__version__
        self.python_version          = sys.version.split(" ")[0]
        self.forecaster_id           = forecaster_id

        self.lags = initialize_lags(type(self).__name__, lags)
        self.max_lag = max(self.lags)
        self.window_size = self.max_lag

        self.weight_func, self.source_code_weight_func, _ = initialize_weights(
            forecaster_name = type(self).__name__, 
            regressor       = regressor, 
            weight_func     = weight_func, 
            series_weights  = None
        )

        self.fit_kwargs = check_select_fit_kwargs(
                              regressor  = regressor,
                              fit_kwargs = fit_kwargs
                          )


    def __repr__(
        self
    ) -> str:
        """
        Information displayed when a ForecasterAutoreg object is printed.
        """

        if isinstance(self.regressor, sklearn.pipeline.Pipeline):
            name_pipe_steps = tuple(name + "__" for name in self.regressor.named_steps.keys())
            params = {key : value for key, value in self.regressor.get_params().items() \
                      if key.startswith(name_pipe_steps)}
        else:
            params = self.regressor.get_params(deep=True)

        info = (
            f"{'=' * len(type(self).__name__)} \n"
            f"{type(self).__name__} \n"
            f"{'=' * len(type(self).__name__)} \n"
            f"Regressor: {self.regressor} \n"
            f"Lags: {self.lags} \n"
            f"Transformer for y: {self.transformer_y} \n"
            f"Transformer for exog: {self.transformer_exog} \n"
            f"Window size: {self.window_size} \n"
            f"Weight function included: {True if self.weight_func is not None else False} \n"
            f"Exogenous included: {self.included_exog} \n"
            f"Type of exogenous variable: {self.exog_type} \n"
            f"Exogenous variables names: {self.exog_col_names} \n"
            f"Training range: {self.training_range.to_list() if self.fitted else None} \n"
            f"Training index type: {str(self.index_type).split('.')[-1][:-2] if self.fitted else None} \n"
            f"Training index frequency: {self.index_freq if self.fitted else None} \n"
            f"Regressor parameters: {params} \n"
            f"fit_kwargs: {self.fit_kwargs} \n"
            f"Creation date: {self.creation_date} \n"
            f"Last fit date: {self.fit_date} \n"
            f"Skforecast version: {self.skforcast_version} \n"
            f"Python version: {self.python_version} \n"
            f"Forecaster id: {self.forecaster_id} \n"
        )

        return info


    def _create_lags(
        self, 
        y: np.ndarray
    ) -> Tuple[np.ndarray, np.ndarray]:
        """       
        Transforms a 1d array into a 2d array (X) and a 1d array (y). Each row
        in X is associated with a value of y and it represents the lags that
        precede it.

        Notice that, the returned matrix X_data, contains the lag 1 in the first
        column, the lag 2 in the second column and so on.

        Parameters
        ----------        
        y : 1d numpy ndarray
            Training time series.

        Returns 
        -------
        X_data : 2d numpy ndarray, shape (samples - max(self.lags), len(self.lags))
            2d numpy array with the lagged values (predictors).

        y_data : 1d numpy ndarray, shape (samples - max(self.lags),)
            Values of the time series related to each row of `X_data`.

        """

        n_splits = len(y) - self.max_lag
        if n_splits <= 0:
            raise ValueError(
                f"The maximum lag ({self.max_lag}) must be less than the length "
                f"of the series ({len(y)})."
            )

        X_data = np.full(shape=(n_splits, len(self.lags)), fill_value=np.nan, dtype=float)

        for i, lag in enumerate(self.lags):
            X_data[:, i] = y[self.max_lag - lag: -lag]

        y_data = y[self.max_lag:]

        return X_data, y_data


    def create_train_X_y(
        self,
        y: pd.Series,
        exog: Optional[Union[pd.Series, pd.DataFrame]]=None
    ) -> Tuple[pd.DataFrame, pd.Series]:
        """
        Create training matrices from univariate time series and exogenous
        variables.

        Parameters
        ----------        
        y : pandas Series
            Training time series.

        exog : pandas Series, pandas DataFrame, default `None`
            Exogenous variable/s included as predictor/s. Must have the same
            number of observations as `y` and their indexes must be aligned.

        Returns 
        -------
        X_train : pandas DataFrame, shape (len(y) - self.max_lag, len(self.lags))
            Pandas DataFrame with the training values (predictors).

        y_train : pandas Series, shape (len(y) - self.max_lag, )
            Values (target) of the time series related to each row of `X_train`.

        """

        check_y(y=y)
        y = transform_series(
                series            = y,
                transformer       = self.transformer_y,
                fit               = True,
                inverse_transform = False
            )
        y_values, y_index = preprocess_y(y=y)

        if exog is not None:
            if len(exog) != len(y):
                raise ValueError(
                    (f'`exog` must have same number of samples as `y`. '
                     f'length `exog`: ({len(exog)}), length `y`: ({len(y)})')
                )
            check_exog(exog=exog, allow_nan=True)
            if isinstance(exog, pd.Series):
                exog = transform_series(
                           series            = exog,
                           transformer       = self.transformer_exog,
                           fit               = True,
                           inverse_transform = False
                       )
            else:
                exog = transform_dataframe(
                           df                = exog,
                           transformer       = self.transformer_exog,
                           fit               = True,
                           inverse_transform = False
                       )

            check_exog(exog=exog, allow_nan=False)
            check_exog_dtypes(exog)
            self.exog_dtypes = get_exog_dtypes(exog=exog)

            _, exog_index = preprocess_exog(exog=exog, return_values=False)
            if not (exog_index[:len(y_index)] == y_index).all():
                raise ValueError(
                    ("Different index for `y` and `exog`. They must be equal "
                     "to ensure the correct alignment of values.")
                )

        X_train, y_train = self._create_lags(y=y_values)
        X_train_col_names = [f"lag_{i}" for i in self.lags]
        X_train = pd.DataFrame(
                      data    = X_train,
                      columns = X_train_col_names,
                      index   = y_index[self.max_lag: ]
                  )

        if exog is not None:
            # The first `self.max_lag` positions have to be removed from exog
            # since they are not in X_train.
            exog_to_train = exog.iloc[self.max_lag:, ]
            X_train = pd.concat((X_train, exog_to_train), axis=1)

        self.X_train_col_names = X_train.columns.to_list()
        y_train = pd.Series(
                      data  = y_train,
                      index = y_index[self.max_lag: ],
                      name  = 'y'
                  )

        return X_train, y_train


    def create_sample_weights(
        self,
        X_train: pd.DataFrame,
    )-> np.ndarray:
        """
        Crate weights for each observation according to the forecaster's attribute
        `weight_func`.

        Parameters
        ----------
        X_train : pandas DataFrame
            Dataframe generated with the method `create_train_X_y`, first return.

        Returns
        -------
        sample_weight : numpy ndarray
            Weights to use in `fit` method.

        """

        sample_weight = None

        if self.weight_func is not None:
            sample_weight = self.weight_func(X_train.index)

        if sample_weight is not None:
            if np.isnan(sample_weight).any():
                raise ValueError(
                    "The resulting `sample_weight` cannot have NaN values."
                )
            if np.any(sample_weight < 0):
                raise ValueError(
                    "The resulting `sample_weight` cannot have negative values."
                )
            if np.sum(sample_weight) == 0:
                raise ValueError(
                    ("The resulting `sample_weight` cannot be normalized because "
                     "the sum of the weights is zero.")
                )

        return sample_weight


    def fit(
        self,
        y: pd.Series,
        exog: Optional[Union[pd.Series, pd.DataFrame]]=None
    ) -> None:
        """
        Training Forecaster.

        Additional arguments to be passed to the `fit` method of the regressor 
        can be added with the `fit_kwargs` argument when initializing the forecaster.

        Parameters
        ----------        
        y : pandas Series
            Training time series.

        exog : pandas Series, pandas DataFrame, default `None`
            Exogenous variable/s included as predictor/s. Must have the same
            number of observations as `y` and their indexes must be aligned so
            that y[i] is regressed on exog[i].

        Returns 
        -------
        None

        """

        # Reset values in case the forecaster has already been fitted.
        self.index_type          = None
        self.index_freq          = None
        self.last_window         = None
        self.included_exog       = False
        self.exog_type           = None
        self.exog_dtypes         = None
        self.exog_col_names      = None
        self.X_train_col_names   = None
        self.in_sample_residuals = None
        self.fitted              = False
        self.training_range      = None

        if exog is not None:
            self.included_exog = True
            self.exog_type = type(exog)
            self.exog_col_names = \
                 exog.columns.to_list() if isinstance(exog, pd.DataFrame) else exog.name

        X_train, y_train = self.create_train_X_y(y=y, exog=exog)
        sample_weight = self.create_sample_weights(X_train=X_train)

        if sample_weight is not None:
            self.regressor.fit(X=X_train, y=y_train, sample_weight=sample_weight,
                               **self.fit_kwargs)
        else:
            self.regressor.fit(X=X_train, y=y_train, **self.fit_kwargs)

        self.fitted = True
        self.fit_date = pd.Timestamp.today().strftime('%Y-%m-%d %H:%M:%S')
        self.training_range = preprocess_y(y=y, return_values=False)[1][[0, -1]]
        self.index_type = type(X_train.index)
        if isinstance(X_train.index, pd.DatetimeIndex):
            self.index_freq = X_train.index.freqstr
        else: 
            self.index_freq = X_train.index.step

        residuals = (y_train - self.regressor.predict(X_train)).to_numpy()

        if len(residuals) > 1000:
            # Only up to 1000 residuals are stored
            rng = np.random.default_rng(seed=123)
            residuals = rng.choice(
                            a       = residuals, 
                            size    = 1000, 
                            replace = False
                        )

        self.in_sample_residuals = residuals

        # The last time window of training data is stored so that lags needed as
        # predictors in the first iteration of `predict()` can be calculated.
        self.last_window = y.iloc[-self.max_lag:].copy()


    def _recursive_predict(
        self,
        steps: int,
        last_window: np.ndarray,
        exog: Optional[np.ndarray]=None
    ) -> np.ndarray:
        """
        Predict n steps ahead. It is an iterative process in which, each prediction,
        is used as a predictor for the next step.

        Parameters
        ----------
        steps : int
            Number of future steps predicted.

        last_window : numpy ndarray
            Series values used to create the predictors (lags) needed in the 
            first iteration of the prediction (t + 1).

        exog : numpy ndarray, default `None`
            Exogenous variable/s included as predictor/s.

        Returns 
        -------
        predictions : numpy ndarray
            Predicted values.

        """

        predictions = np.full(shape=steps, fill_value=np.nan)

        for i in range(steps):
            X = last_window[-self.lags].reshape(1, -1)
            if exog is not None:
                X = np.column_stack((X, exog[i, ].reshape(1, -1)))
            with warnings.catch_warnings():
                # Suppress scikit-learn warning: "X does not have valid feature names,
                # but NoOpTransformer was fitted with feature names".
                warnings.simplefilter("ignore")
                prediction = self.regressor.predict(X)
                predictions[i] = prediction.ravel()[0]

            # Update `last_window` values. The first position is discarded and 
            # the new prediction is added at the end.
            last_window = np.append(last_window[1:], prediction)

        return predictions


    def predict(
        self,
        steps: int,
        last_window: Optional[pd.Series]=None,
        exog: Optional[Union[pd.Series, pd.DataFrame]]=None
    ) -> pd.Series:
        """
        Predict n steps ahead. It is an recursive process in which, each prediction,
        is used as a predictor for the next step.

        Parameters
        ----------
        steps : int
            Number of future steps predicted.

        last_window : pandas Series, default `None`
            Series values used to create the predictors (lags) needed in the 
            first iteration of the prediction (t + 1).

            If `last_window = None`, the values stored in `self.last_window` are
            used to calculate the initial predictors, and the predictions start
            right after training data.

        exog : pandas Series, pandas DataFrame, default `None`
            Exogenous variable/s included as predictor/s.

        Returns 
        -------
        predictions : pandas Series
            Predicted values.

        """

        if last_window is None:
            last_window = copy(self.last_window)

        check_predict_input(
            forecaster_name  = type(self).__name__,
            steps            = steps,
            fitted           = self.fitted,
            included_exog    = self.included_exog,
            index_type       = self.index_type,
            index_freq       = self.index_freq,
            window_size      = self.window_size,
            last_window      = last_window,
            last_window_exog = None,
            exog             = exog,
            exog_type        = self.exog_type,
            exog_col_names   = self.exog_col_names,
            interval         = None,
            alpha            = None,
            max_steps        = None,
            levels           = None,
            series_col_names = None
        ) 

        if exog is not None:
            if isinstance(exog, pd.DataFrame):
                exog = transform_dataframe(
                           df                = exog,
                           transformer       = self.transformer_exog,
                           fit               = False,
                           inverse_transform = False
                       )
            else:
                exog = transform_series(
                           series            = exog,
                           transformer       = self.transformer_exog,
                           fit               = False,
                           inverse_transform = False
                       )
            check_exog_dtypes(exog=exog)
            exog_values = exog.iloc[:steps, ].to_numpy()
        else:
            exog_values = None

        last_window = transform_series(
                          series            = last_window,
                          transformer       = self.transformer_y,
                          fit               = False,
                          inverse_transform = False
                      )
        last_window_values, last_window_index = preprocess_last_window(
                                                    last_window = last_window
                                                )

        predictions = self._recursive_predict(
                          steps       = steps,
                          last_window = copy(last_window_values),
                          exog        = copy(exog_values)
                      )

        predictions = pd.Series(
                          data  = predictions,
                          index = expand_index(
                                      index = last_window_index,
                                      steps = steps
                                  ),
                          name = 'pred'
                      )

        predictions = transform_series(
                          series            = predictions,
                          transformer       = self.transformer_y,
                          fit               = False,
                          inverse_transform = True
                      )

        return predictions


    def predict_bootstrapping(
        self,
        steps: int,
        last_window: Optional[pd.Series]=None,
        exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
        n_boot: int=500,
        random_state: int=123,
        in_sample_residuals: bool=True
    ) -> pd.DataFrame:
        """
        Generate multiple forecasting predictions using a bootstrapping process. 
        By sampling from a collection of past observed errors (the residuals),
        each iteration of bootstrapping generates a different set of predictions. 
        See the Notes section for more information. 

        Parameters
        ----------   
        steps : int
            Number of future steps predicted.

        last_window : pandas Series, default `None`
            Series values used to create the predictors (lags) needed in the 
            first iteration of the prediction (t + 1).

            If `last_window = None`, the values stored in `self.last_window` are
            used to calculate the initial predictors, and the predictions start
            right after training data.

        exog : pandas Series, pandas DataFrame, default `None`
            Exogenous variable/s included as predictor/s.

        n_boot : int, default `500`
            Number of bootstrapping iterations used to estimate prediction
            intervals.

        random_state : int, default `123`
            Sets a seed to the random generator, so that boot intervals are always 
            deterministic.

        in_sample_residuals : bool, default `True`
            If `True`, residuals from the training data are used as proxy of
            prediction error to create prediction intervals. If `False`, out of
            sample residuals are used. In the latter case, the user should have
            calculated and stored the residuals within the forecaster (see
            `set_out_sample_residuals()`).

        Returns 
        -------
        boot_predictions : pandas DataFrame, shape (steps, n_boot)
            Predictions generated by bootstrapping.

        Notes
        -----
        More information about prediction intervals in forecasting:
        https://otexts.com/fpp3/prediction-intervals.html#prediction-intervals-from-bootstrapped-residuals
        Forecasting: Principles and Practice (3nd ed) Rob J Hyndman and George Athanasopoulos.

        """

        if not in_sample_residuals and self.out_sample_residuals is None:
            raise ValueError(
                ('`forecaster.out_sample_residuals` is `None`. Use '
                 '`in_sample_residuals=True` or method `set_out_sample_residuals()` '
                 'before `predict_interval()`, `predict_bootstrapping()` or '
                 '`predict_dist()`.')
            )

        if last_window is None:
            last_window = copy(self.last_window)

        check_predict_input(
            forecaster_name  = type(self).__name__,
            steps            = steps,
            fitted           = self.fitted,
            included_exog    = self.included_exog,
            index_type       = self.index_type,
            index_freq       = self.index_freq,
            window_size      = self.window_size,
            last_window      = last_window,
            last_window_exog = None,
            exog             = exog,
            exog_type        = self.exog_type,
            exog_col_names   = self.exog_col_names,
            interval         = None,
            alpha            = None,
            max_steps        = None,
            levels           = None,
            series_col_names = None
        )

        if exog is not None:
            if isinstance(exog, pd.DataFrame):
                exog = transform_dataframe(
                           df                = exog,
                           transformer       = self.transformer_exog,
                           fit               = False,
                           inverse_transform = False
                       )
            else:
                exog = transform_series(
                           series            = exog,
                           transformer       = self.transformer_exog,
                           fit               = False,
                           inverse_transform = False
                       )
            exog_values = exog.iloc[:steps, ].to_numpy()
        else:
            exog_values = None

        last_window = transform_series(
                          series            = last_window,
                          transformer       = self.transformer_y,
                          fit               = False,
                          inverse_transform = False
                      )
        last_window_values, last_window_index = preprocess_last_window(
                                                    last_window = last_window
                                                )

        boot_predictions = np.full(
                               shape      = (steps, n_boot),
                               fill_value = np.nan,
                               dtype      = float
                           )
        rng = np.random.default_rng(seed=random_state)
        seeds = rng.integers(low=0, high=10000, size=n_boot)

        if in_sample_residuals:
            residuals = self.in_sample_residuals
        else:
            residuals = self.out_sample_residuals

        for i in range(n_boot):
            # In each bootstraping iteration the initial last_window and exog 
            # need to be restored.
            last_window_boot = last_window_values.copy()
            exog_boot = exog_values.copy() if exog is not None else None

            rng = np.random.default_rng(seed=seeds[i])
            sample_residuals = rng.choice(
                                   a       = residuals,
                                   size    = steps,
                                   replace = True
                               )

            for step in range(steps):

                prediction = self._recursive_predict(
                                 steps       = 1,
                                 last_window = last_window_boot,
                                 exog        = exog_boot 
                             )

                prediction_with_residual  = prediction + sample_residuals[step]
                boot_predictions[step, i] = prediction_with_residual

                last_window_boot = np.append(
                                       last_window_boot[1:],
                                       prediction_with_residual
                                   )

                if exog is not None:
                    exog_boot = exog_boot[1:]

        boot_predictions = pd.DataFrame(
                               data    = boot_predictions,
                               index   = expand_index(last_window_index, steps=steps),
                               columns = [f"pred_boot_{i}" for i in range(n_boot)]
                           )

        if self.transformer_y:
            for col in boot_predictions.columns:
                boot_predictions[col] = transform_series(
                                            series            = boot_predictions[col],
                                            transformer       = self.transformer_y,
                                            fit               = False,
                                            inverse_transform = True
                                        )

        return boot_predictions


    def predict_interval(
        self,
        steps: int,
        last_window: Optional[pd.Series]=None,
        exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
        interval: list=[5, 95],
        n_boot: int=500,
        random_state: int=123,
        in_sample_residuals: bool=True
    ) -> pd.DataFrame:
        """
        Iterative process in which each prediction is used as a predictor
        for the next step, and bootstrapping is used to estimate prediction
        intervals. Both predictions and intervals are returned.

        Parameters
        ---------- 
        steps : int
            Number of future steps predicted.

        last_window : pandas Series, default `None`
            Series values used to create the predictors (lags) needed in the 
            first iteration of the prediction (t + 1).

            If `last_window = None`, the values stored in` self.last_window` are
            used to calculate the initial predictors, and the predictions start
            right after training data.

        exog : pandas Series, pandas DataFrame, default `None`
            Exogenous variable/s included as predictor/s.

        interval : list, default `[5, 95]`
            Confidence of the prediction interval estimated. Sequence of 
            percentiles to compute, which must be between 0 and 100 inclusive. 
            For example, interval of 95% should be as `interval = [2.5, 97.5]`.

        n_boot : int, default `500`
            Number of bootstrapping iterations used to estimate prediction
            intervals.

        random_state : int, default `123`
            Sets a seed to the random generator, so that boot intervals are always 
            deterministic.

        in_sample_residuals : bool, default `True`
            If `True`, residuals from the training data are used as proxy of
            prediction error to create prediction intervals. If `False`, out of
            sample residuals are used. In the latter case, the user should have
            calculated and stored the residuals within the forecaster (see
            `set_out_sample_residuals()`).

        Returns 
        -------
        predictions : pandas DataFrame
            Values predicted by the forecaster and their estimated interval:

            - pred: predictions.
            - lower_bound: lower bound of the interval.
            - upper_bound: upper bound interval of the interval.

        Notes
        -----
        More information about prediction intervals in forecasting:
        https://otexts.com/fpp2/prediction-intervals.html
        Forecasting: Principles and Practice (2nd ed) Rob J Hyndman and
        George Athanasopoulos.

        """

        check_interval(interval=interval)

        predictions = self.predict(
                          steps       = steps,
                          last_window = last_window,
                          exog        = exog
                      )

        boot_predictions = self.predict_bootstrapping(
                               steps               = steps,
                               last_window         = last_window,
                               exog                = exog,
                               n_boot              = n_boot,
                               random_state        = random_state,
                               in_sample_residuals = in_sample_residuals
                           )

        interval = np.array(interval)/100
        predictions_interval = boot_predictions.quantile(q=interval, axis=1).transpose()
        predictions_interval.columns = ['lower_bound', 'upper_bound']
        predictions = pd.concat((predictions, predictions_interval), axis=1)

        return predictions


    def predict_dist(
        self,
        steps: int,
        distribution: object,
        last_window: Optional[pd.Series]=None,
        exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
        n_boot: int=500,
        random_state: int=123,
        in_sample_residuals: bool=True
    ) -> pd.DataFrame:
        """
        Fit a given probability distribution for each step. After generating 
        multiple forecasting predictions through a bootstrapping process, each 
        step is fitted to the given distribution.

        Parameters
        ---------- 
        steps : int
            Number of future steps predicted.

        distribution : Object
            A distribution object from scipy.stats.

        last_window : pandas Series, default `None`
            Series values used to create the predictors (lags) needed in the 
            first iteration of the prediction (t + 1).

            If `last_window = None`, the values stored in` self.last_window` are
            used to calculate the initial predictors, and the predictions start
            right after training data.

        exog : pandas Series, pandas DataFrame, default `None`
            Exogenous variable/s included as predictor/s.

        n_boot : int, default `500`
            Number of bootstrapping iterations used to estimate prediction
            intervals.

        random_state : int, default `123`
            Sets a seed to the random generator, so that boot intervals are always 
            deterministic.

        in_sample_residuals : bool, default `True`
            If `True`, residuals from the training data are used as proxy of
            prediction error to create prediction intervals. If `False`, out of
            sample residuals are used. In the latter case, the user should have
            calculated and stored the residuals within the forecaster (see
            `set_out_sample_residuals()`).

        Returns 
        -------
        predictions : pandas DataFrame
            Distribution parameters estimated for each step.

        """

        boot_samples = self.predict_bootstrapping(
                           steps               = steps,
                           last_window         = last_window,
                           exog                = exog,
                           n_boot              = n_boot,
                           random_state        = random_state,
                           in_sample_residuals = in_sample_residuals
                       )       

        param_names = [p for p in inspect.signature(distribution._pdf).parameters if not p=='x'] + ["loc","scale"]
        param_values = np.apply_along_axis(lambda x: distribution.fit(x), axis=1, arr=boot_samples)
        predictions = pd.DataFrame(
                          data    = param_values,
                          columns = param_names,
                          index   = boot_samples.index
                      )

        return predictions


    def set_params(
        self, 
        params: dict
    ) -> None:
        """
        Set new values to the parameters of the scikit learn model stored in the
        forecaster.

        Parameters
        ----------
        params : dict
            Parameters values.

        Returns 
        -------
        self

        """

        self.regressor = clone(self.regressor)
        self.regressor.set_params(**params)


    def set_fit_kwargs(
        self, 
        fit_kwargs: dict
    ) -> None:
        """
        Set new values for the additional keyword arguments passed to the `fit` 
        method of the regressor.

        Parameters
        ----------
        fit_kwargs : dict
            Dict of the form {"argument": new_value}.

        Returns 
        -------
        None

        """

        self.fit_kwargs = check_select_fit_kwargs(self.regressor, fit_kwargs=fit_kwargs)


    def set_lags(
        self, 
        lags: Union[int, list, np.ndarray, range]
    ) -> None:
        """
        Set new value to the attribute `lags`.
        Attributes `max_lag` and `window_size` are also updated.

        Parameters
        ----------
        lags : int, list, 1D np.ndarray, range
            Lags used as predictors. Index starts at 1, so lag 1 is equal to t-1.
                `int`: include lags from 1 to `lags`.
                `list` or `np.ndarray`: include only lags present in `lags`.

        Returns 
        -------
        None

        """

        self.lags = initialize_lags(type(self).__name__, lags)
        self.max_lag = max(self.lags)
        self.window_size = max(self.lags)


    def set_out_sample_residuals(
        self, 
        residuals: np.ndarray, 
        append: bool=True,
        transform: bool=True,
        random_state: int=123
    )-> None:
        """
        Set new values to the attribute `out_sample_residuals`. Out of sample
        residuals are meant to be calculated using observations that did not
        participate in the training process.

        Parameters
        ----------
        residuals : numpy ndarray
            Values of residuals. If len(residuals) > 1000, only a random sample
            of 1000 values are stored.

        append : bool, default `True`
            If `True`, new residuals are added to the once already stored in the
            attribute `out_sample_residuals`. Once the limit of 1000 values is
            reached, no more values are appended. If False, `out_sample_residuals`
            is overwritten with the new residuals.

        transform : bool, default `True`
            If `True`, new residuals are transformed using self.transformer_y.

        random_state : int, default `123`
            Sets a seed to the random sampling for reproducible output.

        Returns 
        -------
        self

        """

        if not isinstance(residuals, np.ndarray):
            raise TypeError(
                f"`residuals` argument must be `numpy ndarray`. Got {type(residuals)}."
            )

        if not transform and self.transformer_y is not None:
            warnings.warn(
                (f"Argument `transform` is set to `False` but forecaster was trained "
                 f"using a transformer {self.transformer_y}. Ensure that the new residuals "
                 f"are already transformed or set `transform=True`.")
            )

        if transform and self.transformer_y is not None:
            warnings.warn(
                (f"Residuals will be transformed using the same transformer used "
                 f"when training the forecaster ({self.transformer_y}). Ensure that the "
                 f"new residuals are on the same scale as the original time series.")
            )

            residuals = transform_series(
                            series            = pd.Series(residuals, name='residuals'),
                            transformer       = self.transformer_y,
                            fit               = False,
                            inverse_transform = False
                        ).to_numpy()

        if len(residuals) > 1000:
            rng = np.random.default_rng(seed=random_state)
            residuals = rng.choice(a=residuals, size=1000, replace=False)

        if append and self.out_sample_residuals is not None:
            free_space = max(0, 1000 - len(self.out_sample_residuals))
            if len(residuals) < free_space:
                residuals = np.hstack((
                                self.out_sample_residuals,
                                residuals
                            ))
            else:
                residuals = np.hstack((
                                self.out_sample_residuals,
                                residuals[:free_space]
                            ))

        self.out_sample_residuals = residuals


    def get_feature_importances(
        self
    ) -> pd.DataFrame:
        """
        Return feature importances of the regressor stored in the
        forecaster. Only valid when regressor stores internally the feature
        importances in the attribute `feature_importances_` or `coef_`.

        Parameters
        ----------
        self

        Returns
        -------
        feature_importances : pandas DataFrame
            Feature importances associated with each predictor.

        """

        if not self.fitted:
            raise sklearn.exceptions.NotFittedError(
                ("This forecaster is not fitted yet. Call `fit` with appropriate "
                 "arguments before using `get_feature_importances()`.")
            )

        if isinstance(self.regressor, sklearn.pipeline.Pipeline):
            estimator = self.regressor[-1]
        else:
            estimator = self.regressor

        if hasattr(estimator, 'feature_importances_'):
            feature_importances = estimator.feature_importances_
        elif hasattr(estimator, 'coef_'):
            feature_importances = estimator.coef_
        else:
            warnings.warn(
                (f"Impossible to access feature importances for regressor of type "
                 f"{type(estimator)}. This method is only valid when the "
                 f"regressor stores internally the feature importances in the "
                 f"attribute `feature_importances_` or `coef_`.")
            )
            feature_importances = None

        if feature_importances is not None:
            feature_importances = pd.DataFrame({
                                      'feature': self.X_train_col_names,
                                      'importance': feature_importances
                                  })

        return feature_importances


    def get_feature_importance(
        self
    ) -> pd.DataFrame:
        """
        This method has been replaced by `get_feature_importances()`.

        Return feature importances of the regressor stored in the
        forecaster. Only valid when regressor stores internally the feature
        importances in the attribute `feature_importances_` or `coef_`.

        Parameters
        ----------
        self

        Returns
        -------
        feature_importances : pandas DataFrame
            Feature importances associated with each predictor.

        """

        warnings.warn(
            ("get_feature_importance() method has been renamed to get_feature_importances(). "
             "This method will be removed in skforecast 0.9.0.")
        )

        return self.get_feature_importances()

`create_sample_weights(self, X_train)` ¶

Crate weights for each observation according to the forecaster's attribute

weight_func.

Parameters:

Name	Type	Description	Default
`X_train`	`DataFrame`	Dataframe generated with the method `create_train_X_y`, first return.	required

Returns:

Type	Description
`ndarray`	Weights to use in `fit` method.

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def create_sample_weights(
    self,
    X_train: pd.DataFrame,
)-> np.ndarray:
    """
    Crate weights for each observation according to the forecaster's attribute
    `weight_func`.

    Parameters
    ----------
    X_train : pandas DataFrame
        Dataframe generated with the method `create_train_X_y`, first return.

    Returns
    -------
    sample_weight : numpy ndarray
        Weights to use in `fit` method.

    """

    sample_weight = None

    if self.weight_func is not None:
        sample_weight = self.weight_func(X_train.index)

    if sample_weight is not None:
        if np.isnan(sample_weight).any():
            raise ValueError(
                "The resulting `sample_weight` cannot have NaN values."
            )
        if np.any(sample_weight < 0):
            raise ValueError(
                "The resulting `sample_weight` cannot have negative values."
            )
        if np.sum(sample_weight) == 0:
            raise ValueError(
                ("The resulting `sample_weight` cannot be normalized because "
                 "the sum of the weights is zero.")
            )

    return sample_weight

`create_train_X_y(self, y, exog=None)` ¶

Create training matrices from univariate time series and exogenous

variables.

Parameters:

Name	Type	Description	Default
`y`	`Series`	Training time series.	required
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s. Must have the same number of observations as `y` and their indexes must be aligned.	`None`

Returns:

Type	Description
`Tuple[pandas.core.frame.DataFrame, pandas.core.series.Series]`	Pandas DataFrame with the training values (predictors).

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def create_train_X_y(
    self,
    y: pd.Series,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None
) -> Tuple[pd.DataFrame, pd.Series]:
    """
    Create training matrices from univariate time series and exogenous
    variables.

    Parameters
    ----------        
    y : pandas Series
        Training time series.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `y` and their indexes must be aligned.

    Returns 
    -------
    X_train : pandas DataFrame, shape (len(y) - self.max_lag, len(self.lags))
        Pandas DataFrame with the training values (predictors).

    y_train : pandas Series, shape (len(y) - self.max_lag, )
        Values (target) of the time series related to each row of `X_train`.

    """

    check_y(y=y)
    y = transform_series(
            series            = y,
            transformer       = self.transformer_y,
            fit               = True,
            inverse_transform = False
        )
    y_values, y_index = preprocess_y(y=y)

    if exog is not None:
        if len(exog) != len(y):
            raise ValueError(
                (f'`exog` must have same number of samples as `y`. '
                 f'length `exog`: ({len(exog)}), length `y`: ({len(y)})')
            )
        check_exog(exog=exog, allow_nan=True)
        if isinstance(exog, pd.Series):
            exog = transform_series(
                       series            = exog,
                       transformer       = self.transformer_exog,
                       fit               = True,
                       inverse_transform = False
                   )
        else:
            exog = transform_dataframe(
                       df                = exog,
                       transformer       = self.transformer_exog,
                       fit               = True,
                       inverse_transform = False
                   )

        check_exog(exog=exog, allow_nan=False)
        check_exog_dtypes(exog)
        self.exog_dtypes = get_exog_dtypes(exog=exog)

        _, exog_index = preprocess_exog(exog=exog, return_values=False)
        if not (exog_index[:len(y_index)] == y_index).all():
            raise ValueError(
                ("Different index for `y` and `exog`. They must be equal "
                 "to ensure the correct alignment of values.")
            )

    X_train, y_train = self._create_lags(y=y_values)
    X_train_col_names = [f"lag_{i}" for i in self.lags]
    X_train = pd.DataFrame(
                  data    = X_train,
                  columns = X_train_col_names,
                  index   = y_index[self.max_lag: ]
              )

    if exog is not None:
        # The first `self.max_lag` positions have to be removed from exog
        # since they are not in X_train.
        exog_to_train = exog.iloc[self.max_lag:, ]
        X_train = pd.concat((X_train, exog_to_train), axis=1)

    self.X_train_col_names = X_train.columns.to_list()
    y_train = pd.Series(
                  data  = y_train,
                  index = y_index[self.max_lag: ],
                  name  = 'y'
              )

    return X_train, y_train

`fit(self, y, exog=None)` ¶

Training Forecaster.

Additional arguments to be passed to the fit method of the regressor can be added with the fit_kwargs argument when initializing the forecaster.

Parameters:

Name	Type	Description	Default
`y`	`Series`	Training time series.	required
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s. Must have the same number of observations as `y` and their indexes must be aligned so that y[i] is regressed on exog[i].	`None`

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def fit(
    self,
    y: pd.Series,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None
) -> None:
    """
    Training Forecaster.

    Additional arguments to be passed to the `fit` method of the regressor 
    can be added with the `fit_kwargs` argument when initializing the forecaster.

    Parameters
    ----------        
    y : pandas Series
        Training time series.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `y` and their indexes must be aligned so
        that y[i] is regressed on exog[i].

    Returns 
    -------
    None

    """

    # Reset values in case the forecaster has already been fitted.
    self.index_type          = None
    self.index_freq          = None
    self.last_window         = None
    self.included_exog       = False
    self.exog_type           = None
    self.exog_dtypes         = None
    self.exog_col_names      = None
    self.X_train_col_names   = None
    self.in_sample_residuals = None
    self.fitted              = False
    self.training_range      = None

    if exog is not None:
        self.included_exog = True
        self.exog_type = type(exog)
        self.exog_col_names = \
             exog.columns.to_list() if isinstance(exog, pd.DataFrame) else exog.name

    X_train, y_train = self.create_train_X_y(y=y, exog=exog)
    sample_weight = self.create_sample_weights(X_train=X_train)

    if sample_weight is not None:
        self.regressor.fit(X=X_train, y=y_train, sample_weight=sample_weight,
                           **self.fit_kwargs)
    else:
        self.regressor.fit(X=X_train, y=y_train, **self.fit_kwargs)

    self.fitted = True
    self.fit_date = pd.Timestamp.today().strftime('%Y-%m-%d %H:%M:%S')
    self.training_range = preprocess_y(y=y, return_values=False)[1][[0, -1]]
    self.index_type = type(X_train.index)
    if isinstance(X_train.index, pd.DatetimeIndex):
        self.index_freq = X_train.index.freqstr
    else: 
        self.index_freq = X_train.index.step

    residuals = (y_train - self.regressor.predict(X_train)).to_numpy()

    if len(residuals) > 1000:
        # Only up to 1000 residuals are stored
        rng = np.random.default_rng(seed=123)
        residuals = rng.choice(
                        a       = residuals, 
                        size    = 1000, 
                        replace = False
                    )

    self.in_sample_residuals = residuals

    # The last time window of training data is stored so that lags needed as
    # predictors in the first iteration of `predict()` can be calculated.
    self.last_window = y.iloc[-self.max_lag:].copy()

`get_feature_importance(self)` ¶

This method has been replaced by get_feature_importances().

Return feature importances of the regressor stored in the forecaster. Only valid when regressor stores internally the feature importances in the attribute feature_importances_ or coef_.

Parameters:

Name	Type	Description	Default
`self`	`None`		required

Returns:

Type	Description
`DataFrame`	Feature importances associated with each predictor.

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def get_feature_importance(
    self
) -> pd.DataFrame:
    """
    This method has been replaced by `get_feature_importances()`.

    Return feature importances of the regressor stored in the
    forecaster. Only valid when regressor stores internally the feature
    importances in the attribute `feature_importances_` or `coef_`.

    Parameters
    ----------
    self

    Returns
    -------
    feature_importances : pandas DataFrame
        Feature importances associated with each predictor.

    """

    warnings.warn(
        ("get_feature_importance() method has been renamed to get_feature_importances(). "
         "This method will be removed in skforecast 0.9.0.")
    )

    return self.get_feature_importances()

`get_feature_importances(self)` ¶

Return feature importances of the regressor stored in the

forecaster. Only valid when regressor stores internally the feature importances in the attribute feature_importances_ or coef_.

Parameters:

Name	Type	Description	Default
`self`	`None`		required

Returns:

Type	Description
`DataFrame`	Feature importances associated with each predictor.

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def get_feature_importances(
    self
) -> pd.DataFrame:
    """
    Return feature importances of the regressor stored in the
    forecaster. Only valid when regressor stores internally the feature
    importances in the attribute `feature_importances_` or `coef_`.

    Parameters
    ----------
    self

    Returns
    -------
    feature_importances : pandas DataFrame
        Feature importances associated with each predictor.

    """

    if not self.fitted:
        raise sklearn.exceptions.NotFittedError(
            ("This forecaster is not fitted yet. Call `fit` with appropriate "
             "arguments before using `get_feature_importances()`.")
        )

    if isinstance(self.regressor, sklearn.pipeline.Pipeline):
        estimator = self.regressor[-1]
    else:
        estimator = self.regressor

    if hasattr(estimator, 'feature_importances_'):
        feature_importances = estimator.feature_importances_
    elif hasattr(estimator, 'coef_'):
        feature_importances = estimator.coef_
    else:
        warnings.warn(
            (f"Impossible to access feature importances for regressor of type "
             f"{type(estimator)}. This method is only valid when the "
             f"regressor stores internally the feature importances in the "
             f"attribute `feature_importances_` or `coef_`.")
        )
        feature_importances = None

    if feature_importances is not None:
        feature_importances = pd.DataFrame({
                                  'feature': self.X_train_col_names,
                                  'importance': feature_importances
                              })

    return feature_importances

`predict(self, steps, last_window=None, exog=None)` ¶

Predict n steps ahead. It is an recursive process in which, each prediction,

is used as a predictor for the next step.

Parameters:

Name Type Description Default

steps

int

Number of future steps predicted.

required

last_window

Optional[pandas.core.series.Series]

Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1).

If last_window = None, the values stored in self.last_window are used to calculate the initial predictors, and the predictions start right after training data.

None

exog

Union[pandas.core.series.Series, pandas.core.frame.DataFrame]

Exogenous variable/s included as predictor/s.

None

Returns:

Type	Description
`Series`	Predicted values.

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def predict(
    self,
    steps: int,
    last_window: Optional[pd.Series]=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None
) -> pd.Series:
    """
    Predict n steps ahead. It is an recursive process in which, each prediction,
    is used as a predictor for the next step.

    Parameters
    ----------
    steps : int
        Number of future steps predicted.

    last_window : pandas Series, default `None`
        Series values used to create the predictors (lags) needed in the 
        first iteration of the prediction (t + 1).

        If `last_window = None`, the values stored in `self.last_window` are
        used to calculate the initial predictors, and the predictions start
        right after training data.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s.

    Returns 
    -------
    predictions : pandas Series
        Predicted values.

    """

    if last_window is None:
        last_window = copy(self.last_window)

    check_predict_input(
        forecaster_name  = type(self).__name__,
        steps            = steps,
        fitted           = self.fitted,
        included_exog    = self.included_exog,
        index_type       = self.index_type,
        index_freq       = self.index_freq,
        window_size      = self.window_size,
        last_window      = last_window,
        last_window_exog = None,
        exog             = exog,
        exog_type        = self.exog_type,
        exog_col_names   = self.exog_col_names,
        interval         = None,
        alpha            = None,
        max_steps        = None,
        levels           = None,
        series_col_names = None
    ) 

    if exog is not None:
        if isinstance(exog, pd.DataFrame):
            exog = transform_dataframe(
                       df                = exog,
                       transformer       = self.transformer_exog,
                       fit               = False,
                       inverse_transform = False
                   )
        else:
            exog = transform_series(
                       series            = exog,
                       transformer       = self.transformer_exog,
                       fit               = False,
                       inverse_transform = False
                   )
        check_exog_dtypes(exog=exog)
        exog_values = exog.iloc[:steps, ].to_numpy()
    else:
        exog_values = None

    last_window = transform_series(
                      series            = last_window,
                      transformer       = self.transformer_y,
                      fit               = False,
                      inverse_transform = False
                  )
    last_window_values, last_window_index = preprocess_last_window(
                                                last_window = last_window
                                            )

    predictions = self._recursive_predict(
                      steps       = steps,
                      last_window = copy(last_window_values),
                      exog        = copy(exog_values)
                  )

    predictions = pd.Series(
                      data  = predictions,
                      index = expand_index(
                                  index = last_window_index,
                                  steps = steps
                              ),
                      name = 'pred'
                  )

    predictions = transform_series(
                      series            = predictions,
                      transformer       = self.transformer_y,
                      fit               = False,
                      inverse_transform = True
                  )

    return predictions

`predict_bootstrapping(self, steps, last_window=None, exog=None, n_boot=500, random_state=123, in_sample_residuals=True)` ¶

Generate multiple forecasting predictions using a bootstrapping process.

By sampling from a collection of past observed errors (the residuals), each iteration of bootstrapping generates a different set of predictions. See the Notes section for more information.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Number of future steps predicted.	required
`last_window`	`Optional[pandas.core.series.Series]`	Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1). If `last_window = None`, the values stored in `self.last_window` are used to calculate the initial predictors, and the predictions start right after training data.	`None`
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s.	`None`
`n_boot`	`int`	Number of bootstrapping iterations used to estimate prediction intervals.	`500`
`random_state`	`int`	Sets a seed to the random generator, so that boot intervals are always deterministic.	`123`
`in_sample_residuals`	`bool`	If `True`, residuals from the training data are used as proxy of prediction error to create prediction intervals. If `False`, out of sample residuals are used. In the latter case, the user should have calculated and stored the residuals within the forecaster (see `set_out_sample_residuals()`).	`True`

Returns:

Type	Description
`DataFrame`	Predictions generated by bootstrapping.

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def predict_bootstrapping(
    self,
    steps: int,
    last_window: Optional[pd.Series]=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    n_boot: int=500,
    random_state: int=123,
    in_sample_residuals: bool=True
) -> pd.DataFrame:
    """
    Generate multiple forecasting predictions using a bootstrapping process. 
    By sampling from a collection of past observed errors (the residuals),
    each iteration of bootstrapping generates a different set of predictions. 
    See the Notes section for more information. 

    Parameters
    ----------   
    steps : int
        Number of future steps predicted.

    last_window : pandas Series, default `None`
        Series values used to create the predictors (lags) needed in the 
        first iteration of the prediction (t + 1).

        If `last_window = None`, the values stored in `self.last_window` are
        used to calculate the initial predictors, and the predictions start
        right after training data.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s.

    n_boot : int, default `500`
        Number of bootstrapping iterations used to estimate prediction
        intervals.

    random_state : int, default `123`
        Sets a seed to the random generator, so that boot intervals are always 
        deterministic.

    in_sample_residuals : bool, default `True`
        If `True`, residuals from the training data are used as proxy of
        prediction error to create prediction intervals. If `False`, out of
        sample residuals are used. In the latter case, the user should have
        calculated and stored the residuals within the forecaster (see
        `set_out_sample_residuals()`).

    Returns 
    -------
    boot_predictions : pandas DataFrame, shape (steps, n_boot)
        Predictions generated by bootstrapping.

    Notes
    -----
    More information about prediction intervals in forecasting:
    https://otexts.com/fpp3/prediction-intervals.html#prediction-intervals-from-bootstrapped-residuals
    Forecasting: Principles and Practice (3nd ed) Rob J Hyndman and George Athanasopoulos.

    """

    if not in_sample_residuals and self.out_sample_residuals is None:
        raise ValueError(
            ('`forecaster.out_sample_residuals` is `None`. Use '
             '`in_sample_residuals=True` or method `set_out_sample_residuals()` '
             'before `predict_interval()`, `predict_bootstrapping()` or '
             '`predict_dist()`.')
        )

    if last_window is None:
        last_window = copy(self.last_window)

    check_predict_input(
        forecaster_name  = type(self).__name__,
        steps            = steps,
        fitted           = self.fitted,
        included_exog    = self.included_exog,
        index_type       = self.index_type,
        index_freq       = self.index_freq,
        window_size      = self.window_size,
        last_window      = last_window,
        last_window_exog = None,
        exog             = exog,
        exog_type        = self.exog_type,
        exog_col_names   = self.exog_col_names,
        interval         = None,
        alpha            = None,
        max_steps        = None,
        levels           = None,
        series_col_names = None
    )

    if exog is not None:
        if isinstance(exog, pd.DataFrame):
            exog = transform_dataframe(
                       df                = exog,
                       transformer       = self.transformer_exog,
                       fit               = False,
                       inverse_transform = False
                   )
        else:
            exog = transform_series(
                       series            = exog,
                       transformer       = self.transformer_exog,
                       fit               = False,
                       inverse_transform = False
                   )
        exog_values = exog.iloc[:steps, ].to_numpy()
    else:
        exog_values = None

    last_window = transform_series(
                      series            = last_window,
                      transformer       = self.transformer_y,
                      fit               = False,
                      inverse_transform = False
                  )
    last_window_values, last_window_index = preprocess_last_window(
                                                last_window = last_window
                                            )

    boot_predictions = np.full(
                           shape      = (steps, n_boot),
                           fill_value = np.nan,
                           dtype      = float
                       )
    rng = np.random.default_rng(seed=random_state)
    seeds = rng.integers(low=0, high=10000, size=n_boot)

    if in_sample_residuals:
        residuals = self.in_sample_residuals
    else:
        residuals = self.out_sample_residuals

    for i in range(n_boot):
        # In each bootstraping iteration the initial last_window and exog 
        # need to be restored.
        last_window_boot = last_window_values.copy()
        exog_boot = exog_values.copy() if exog is not None else None

        rng = np.random.default_rng(seed=seeds[i])
        sample_residuals = rng.choice(
                               a       = residuals,
                               size    = steps,
                               replace = True
                           )

        for step in range(steps):

            prediction = self._recursive_predict(
                             steps       = 1,
                             last_window = last_window_boot,
                             exog        = exog_boot 
                         )

            prediction_with_residual  = prediction + sample_residuals[step]
            boot_predictions[step, i] = prediction_with_residual

            last_window_boot = np.append(
                                   last_window_boot[1:],
                                   prediction_with_residual
                               )

            if exog is not None:
                exog_boot = exog_boot[1:]

    boot_predictions = pd.DataFrame(
                           data    = boot_predictions,
                           index   = expand_index(last_window_index, steps=steps),
                           columns = [f"pred_boot_{i}" for i in range(n_boot)]
                       )

    if self.transformer_y:
        for col in boot_predictions.columns:
            boot_predictions[col] = transform_series(
                                        series            = boot_predictions[col],
                                        transformer       = self.transformer_y,
                                        fit               = False,
                                        inverse_transform = True
                                    )

    return boot_predictions

`predict_dist(self, steps, distribution, last_window=None, exog=None, n_boot=500, random_state=123, in_sample_residuals=True)` ¶

Fit a given probability distribution for each step. After generating

multiple forecasting predictions through a bootstrapping process, each step is fitted to the given distribution.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Number of future steps predicted.	required
`distribution`	`object`	A distribution object from scipy.stats.	required
`last_window`	`Optional[pandas.core.series.Series]`	Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1). If `last_window = None`, the values stored in`self.last_window` are used to calculate the initial predictors, and the predictions start right after training data.	`None`
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s.	`None`
`n_boot`	`int`	Number of bootstrapping iterations used to estimate prediction intervals.	`500`
`random_state`	`int`	Sets a seed to the random generator, so that boot intervals are always deterministic.	`123`
`in_sample_residuals`	`bool`	If `True`, residuals from the training data are used as proxy of prediction error to create prediction intervals. If `False`, out of sample residuals are used. In the latter case, the user should have calculated and stored the residuals within the forecaster (see `set_out_sample_residuals()`).	`True`

Returns:

Type	Description
`DataFrame`	Distribution parameters estimated for each step.

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def predict_dist(
    self,
    steps: int,
    distribution: object,
    last_window: Optional[pd.Series]=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    n_boot: int=500,
    random_state: int=123,
    in_sample_residuals: bool=True
) -> pd.DataFrame:
    """
    Fit a given probability distribution for each step. After generating 
    multiple forecasting predictions through a bootstrapping process, each 
    step is fitted to the given distribution.

    Parameters
    ---------- 
    steps : int
        Number of future steps predicted.

    distribution : Object
        A distribution object from scipy.stats.

    last_window : pandas Series, default `None`
        Series values used to create the predictors (lags) needed in the 
        first iteration of the prediction (t + 1).

        If `last_window = None`, the values stored in` self.last_window` are
        used to calculate the initial predictors, and the predictions start
        right after training data.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s.

    n_boot : int, default `500`
        Number of bootstrapping iterations used to estimate prediction
        intervals.

    random_state : int, default `123`
        Sets a seed to the random generator, so that boot intervals are always 
        deterministic.

    in_sample_residuals : bool, default `True`
        If `True`, residuals from the training data are used as proxy of
        prediction error to create prediction intervals. If `False`, out of
        sample residuals are used. In the latter case, the user should have
        calculated and stored the residuals within the forecaster (see
        `set_out_sample_residuals()`).

    Returns 
    -------
    predictions : pandas DataFrame
        Distribution parameters estimated for each step.

    """

    boot_samples = self.predict_bootstrapping(
                       steps               = steps,
                       last_window         = last_window,
                       exog                = exog,
                       n_boot              = n_boot,
                       random_state        = random_state,
                       in_sample_residuals = in_sample_residuals
                   )       

    param_names = [p for p in inspect.signature(distribution._pdf).parameters if not p=='x'] + ["loc","scale"]
    param_values = np.apply_along_axis(lambda x: distribution.fit(x), axis=1, arr=boot_samples)
    predictions = pd.DataFrame(
                      data    = param_values,
                      columns = param_names,
                      index   = boot_samples.index
                  )

    return predictions

`predict_interval(self, steps, last_window=None, exog=None, interval=[5, 95], n_boot=500, random_state=123, in_sample_residuals=True)` ¶

Iterative process in which each prediction is used as a predictor

for the next step, and bootstrapping is used to estimate prediction intervals. Both predictions and intervals are returned.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Number of future steps predicted.	required
`last_window`	`Optional[pandas.core.series.Series]`	Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1). If `last_window = None`, the values stored in`self.last_window` are used to calculate the initial predictors, and the predictions start right after training data.	`None`
`exog`	`Union[pandas.core.series.Series, pandas.core.frame.DataFrame]`	Exogenous variable/s included as predictor/s.	`None`
`interval`	`list`	Confidence of the prediction interval estimated. Sequence of percentiles to compute, which must be between 0 and 100 inclusive. For example, interval of 95% should be as `interval = [2.5, 97.5]`.	`[5, 95]`
`n_boot`	`int`	Number of bootstrapping iterations used to estimate prediction intervals.	`500`
`random_state`	`int`	Sets a seed to the random generator, so that boot intervals are always deterministic.	`123`
`in_sample_residuals`	`bool`	If `True`, residuals from the training data are used as proxy of prediction error to create prediction intervals. If `False`, out of sample residuals are used. In the latter case, the user should have calculated and stored the residuals within the forecaster (see `set_out_sample_residuals()`).	`True`

Returns:

Type Description

DataFrame

Values predicted by the forecaster and their estimated interval:

pred: predictions.
lower_bound: lower bound of the interval.
upper_bound: upper bound interval of the interval.

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def predict_interval(
    self,
    steps: int,
    last_window: Optional[pd.Series]=None,
    exog: Optional[Union[pd.Series, pd.DataFrame]]=None,
    interval: list=[5, 95],
    n_boot: int=500,
    random_state: int=123,
    in_sample_residuals: bool=True
) -> pd.DataFrame:
    """
    Iterative process in which each prediction is used as a predictor
    for the next step, and bootstrapping is used to estimate prediction
    intervals. Both predictions and intervals are returned.

    Parameters
    ---------- 
    steps : int
        Number of future steps predicted.

    last_window : pandas Series, default `None`
        Series values used to create the predictors (lags) needed in the 
        first iteration of the prediction (t + 1).

        If `last_window = None`, the values stored in` self.last_window` are
        used to calculate the initial predictors, and the predictions start
        right after training data.

    exog : pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s.

    interval : list, default `[5, 95]`
        Confidence of the prediction interval estimated. Sequence of 
        percentiles to compute, which must be between 0 and 100 inclusive. 
        For example, interval of 95% should be as `interval = [2.5, 97.5]`.

    n_boot : int, default `500`
        Number of bootstrapping iterations used to estimate prediction
        intervals.

    random_state : int, default `123`
        Sets a seed to the random generator, so that boot intervals are always 
        deterministic.

    in_sample_residuals : bool, default `True`
        If `True`, residuals from the training data are used as proxy of
        prediction error to create prediction intervals. If `False`, out of
        sample residuals are used. In the latter case, the user should have
        calculated and stored the residuals within the forecaster (see
        `set_out_sample_residuals()`).

    Returns 
    -------
    predictions : pandas DataFrame
        Values predicted by the forecaster and their estimated interval:

        - pred: predictions.
        - lower_bound: lower bound of the interval.
        - upper_bound: upper bound interval of the interval.

    Notes
    -----
    More information about prediction intervals in forecasting:
    https://otexts.com/fpp2/prediction-intervals.html
    Forecasting: Principles and Practice (2nd ed) Rob J Hyndman and
    George Athanasopoulos.

    """

    check_interval(interval=interval)

    predictions = self.predict(
                      steps       = steps,
                      last_window = last_window,
                      exog        = exog
                  )

    boot_predictions = self.predict_bootstrapping(
                           steps               = steps,
                           last_window         = last_window,
                           exog                = exog,
                           n_boot              = n_boot,
                           random_state        = random_state,
                           in_sample_residuals = in_sample_residuals
                       )

    interval = np.array(interval)/100
    predictions_interval = boot_predictions.quantile(q=interval, axis=1).transpose()
    predictions_interval.columns = ['lower_bound', 'upper_bound']
    predictions = pd.concat((predictions, predictions_interval), axis=1)

    return predictions

`set_fit_kwargs(self, fit_kwargs)` ¶

Set new values for the additional keyword arguments passed to the fit

method of the regressor.

Parameters:

Name	Type	Description	Default
`fit_kwargs`	`dict`	Dict of the form {"argument": new_value}.	required

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def set_fit_kwargs(
    self, 
    fit_kwargs: dict
) -> None:
    """
    Set new values for the additional keyword arguments passed to the `fit` 
    method of the regressor.

    Parameters
    ----------
    fit_kwargs : dict
        Dict of the form {"argument": new_value}.

    Returns 
    -------
    None

    """

    self.fit_kwargs = check_select_fit_kwargs(self.regressor, fit_kwargs=fit_kwargs)

`set_lags(self, lags)` ¶

Set new value to the attribute lags.

Attributes max_lag and window_size are also updated.

Parameters:

Name	Type	Description	Default
`lags`	`Union[int, list, numpy.ndarray, range]`	Lags used as predictors. Index starts at 1, so lag 1 is equal to t-1. `int`: include lags from 1 to `lags`. `list` or `np.ndarray`: include only lags present in `lags`.	required

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def set_lags(
    self, 
    lags: Union[int, list, np.ndarray, range]
) -> None:
    """
    Set new value to the attribute `lags`.
    Attributes `max_lag` and `window_size` are also updated.

    Parameters
    ----------
    lags : int, list, 1D np.ndarray, range
        Lags used as predictors. Index starts at 1, so lag 1 is equal to t-1.
            `int`: include lags from 1 to `lags`.
            `list` or `np.ndarray`: include only lags present in `lags`.

    Returns 
    -------
    None

    """

    self.lags = initialize_lags(type(self).__name__, lags)
    self.max_lag = max(self.lags)
    self.window_size = max(self.lags)

`set_out_sample_residuals(self, residuals, append=True, transform=True, random_state=123)` ¶

Set new values to the attribute out_sample_residuals. Out of sample

residuals are meant to be calculated using observations that did not participate in the training process.

Parameters:

Name	Type	Description	Default
`residuals`	`ndarray`	Values of residuals. If len(residuals) > 1000, only a random sample of 1000 values are stored.	required
`append`	`bool`	If `True`, new residuals are added to the once already stored in the attribute `out_sample_residuals`. Once the limit of 1000 values is reached, no more values are appended. If False, `out_sample_residuals` is overwritten with the new residuals.	`True`
`transform`	`bool`	If `True`, new residuals are transformed using self.transformer_y.	`True`
`random_state`	`int`	Sets a seed to the random sampling for reproducible output.	`123`

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def set_out_sample_residuals(
    self, 
    residuals: np.ndarray, 
    append: bool=True,
    transform: bool=True,
    random_state: int=123
)-> None:
    """
    Set new values to the attribute `out_sample_residuals`. Out of sample
    residuals are meant to be calculated using observations that did not
    participate in the training process.

    Parameters
    ----------
    residuals : numpy ndarray
        Values of residuals. If len(residuals) > 1000, only a random sample
        of 1000 values are stored.

    append : bool, default `True`
        If `True`, new residuals are added to the once already stored in the
        attribute `out_sample_residuals`. Once the limit of 1000 values is
        reached, no more values are appended. If False, `out_sample_residuals`
        is overwritten with the new residuals.

    transform : bool, default `True`
        If `True`, new residuals are transformed using self.transformer_y.

    random_state : int, default `123`
        Sets a seed to the random sampling for reproducible output.

    Returns 
    -------
    self

    """

    if not isinstance(residuals, np.ndarray):
        raise TypeError(
            f"`residuals` argument must be `numpy ndarray`. Got {type(residuals)}."
        )

    if not transform and self.transformer_y is not None:
        warnings.warn(
            (f"Argument `transform` is set to `False` but forecaster was trained "
             f"using a transformer {self.transformer_y}. Ensure that the new residuals "
             f"are already transformed or set `transform=True`.")
        )

    if transform and self.transformer_y is not None:
        warnings.warn(
            (f"Residuals will be transformed using the same transformer used "
             f"when training the forecaster ({self.transformer_y}). Ensure that the "
             f"new residuals are on the same scale as the original time series.")
        )

        residuals = transform_series(
                        series            = pd.Series(residuals, name='residuals'),
                        transformer       = self.transformer_y,
                        fit               = False,
                        inverse_transform = False
                    ).to_numpy()

    if len(residuals) > 1000:
        rng = np.random.default_rng(seed=random_state)
        residuals = rng.choice(a=residuals, size=1000, replace=False)

    if append and self.out_sample_residuals is not None:
        free_space = max(0, 1000 - len(self.out_sample_residuals))
        if len(residuals) < free_space:
            residuals = np.hstack((
                            self.out_sample_residuals,
                            residuals
                        ))
        else:
            residuals = np.hstack((
                            self.out_sample_residuals,
                            residuals[:free_space]
                        ))

    self.out_sample_residuals = residuals

`set_params(self, params)` ¶

Set new values to the parameters of the scikit learn model stored in the

forecaster.

Parameters:

Name	Type	Description	Default
`params`	`dict`	Parameters values.	required

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def set_params(
    self, 
    params: dict
) -> None:
    """
    Set new values to the parameters of the scikit learn model stored in the
    forecaster.

    Parameters
    ----------
    params : dict
        Parameters values.

    Returns 
    -------
    self

    """

    self.regressor = clone(self.regressor)
    self.regressor.set_params(**params)

`_create_lags(self, y)` `private` ¶

Transforms a 1d array into a 2d array (X) and a 1d array (y). Each row

in X is associated with a value of y and it represents the lags that precede it.

Notice that, the returned matrix X_data, contains the lag 1 in the first column, the lag 2 in the second column and so on.

Parameters:

Name	Type	Description	Default
`y`	`ndarray`	Training time series.	required

Returns:

Type	Description
`Tuple[numpy.ndarray, numpy.ndarray]`	2d numpy array with the lagged values (predictors).

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def _create_lags(
    self, 
    y: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    """       
    Transforms a 1d array into a 2d array (X) and a 1d array (y). Each row
    in X is associated with a value of y and it represents the lags that
    precede it.

    Notice that, the returned matrix X_data, contains the lag 1 in the first
    column, the lag 2 in the second column and so on.

    Parameters
    ----------        
    y : 1d numpy ndarray
        Training time series.

    Returns 
    -------
    X_data : 2d numpy ndarray, shape (samples - max(self.lags), len(self.lags))
        2d numpy array with the lagged values (predictors).

    y_data : 1d numpy ndarray, shape (samples - max(self.lags),)
        Values of the time series related to each row of `X_data`.

    """

    n_splits = len(y) - self.max_lag
    if n_splits <= 0:
        raise ValueError(
            f"The maximum lag ({self.max_lag}) must be less than the length "
            f"of the series ({len(y)})."
        )

    X_data = np.full(shape=(n_splits, len(self.lags)), fill_value=np.nan, dtype=float)

    for i, lag in enumerate(self.lags):
        X_data[:, i] = y[self.max_lag - lag: -lag]

    y_data = y[self.max_lag:]

    return X_data, y_data

`_recursive_predict(self, steps, last_window, exog=None)` `private` ¶

Predict n steps ahead. It is an iterative process in which, each prediction,

is used as a predictor for the next step.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Number of future steps predicted.	required
`last_window`	`ndarray`	Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1).	required
`exog`	`Optional[numpy.ndarray]`	Exogenous variable/s included as predictor/s.	`None`

Returns:

Type	Description
`ndarray`	Predicted values.

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def _recursive_predict(
    self,
    steps: int,
    last_window: np.ndarray,
    exog: Optional[np.ndarray]=None
) -> np.ndarray:
    """
    Predict n steps ahead. It is an iterative process in which, each prediction,
    is used as a predictor for the next step.

    Parameters
    ----------
    steps : int
        Number of future steps predicted.

    last_window : numpy ndarray
        Series values used to create the predictors (lags) needed in the 
        first iteration of the prediction (t + 1).

    exog : numpy ndarray, default `None`
        Exogenous variable/s included as predictor/s.

    Returns 
    -------
    predictions : numpy ndarray
        Predicted values.

    """

    predictions = np.full(shape=steps, fill_value=np.nan)

    for i in range(steps):
        X = last_window[-self.lags].reshape(1, -1)
        if exog is not None:
            X = np.column_stack((X, exog[i, ].reshape(1, -1)))
        with warnings.catch_warnings():
            # Suppress scikit-learn warning: "X does not have valid feature names,
            # but NoOpTransformer was fitted with feature names".
            warnings.simplefilter("ignore")
            prediction = self.regressor.predict(X)
            predictions[i] = prediction.ravel()[0]

        # Update `last_window` values. The first position is discarded and 
        # the new prediction is added at the end.
        last_window = np.append(last_window[1:], prediction)

    return predictions

`repr(self)` `special` ¶

Information displayed when a ForecasterAutoreg object is printed.

Source code in skforecast/ForecasterAutoreg/ForecasterAutoreg.py

def __repr__(
    self
) -> str:
    """
    Information displayed when a ForecasterAutoreg object is printed.
    """

    if isinstance(self.regressor, sklearn.pipeline.Pipeline):
        name_pipe_steps = tuple(name + "__" for name in self.regressor.named_steps.keys())
        params = {key : value for key, value in self.regressor.get_params().items() \
                  if key.startswith(name_pipe_steps)}
    else:
        params = self.regressor.get_params(deep=True)

    info = (
        f"{'=' * len(type(self).__name__)} \n"
        f"{type(self).__name__} \n"
        f"{'=' * len(type(self).__name__)} \n"
        f"Regressor: {self.regressor} \n"
        f"Lags: {self.lags} \n"
        f"Transformer for y: {self.transformer_y} \n"
        f"Transformer for exog: {self.transformer_exog} \n"
        f"Window size: {self.window_size} \n"
        f"Weight function included: {True if self.weight_func is not None else False} \n"
        f"Exogenous included: {self.included_exog} \n"
        f"Type of exogenous variable: {self.exog_type} \n"
        f"Exogenous variables names: {self.exog_col_names} \n"
        f"Training range: {self.training_range.to_list() if self.fitted else None} \n"
        f"Training index type: {str(self.index_type).split('.')[-1][:-2] if self.fitted else None} \n"
        f"Training index frequency: {self.index_freq if self.fitted else None} \n"
        f"Regressor parameters: {params} \n"
        f"fit_kwargs: {self.fit_kwargs} \n"
        f"Creation date: {self.creation_date} \n"
        f"Last fit date: {self.fit_date} \n"
        f"Skforecast version: {self.skforcast_version} \n"
        f"Python version: {self.python_version} \n"
        f"Forecaster id: {self.forecaster_id} \n"
    )

    return info

ForecasterAutoreg¶