`stats`¶

skforecast.stats._arima.Arima ¶


Arima(
    order=(1, 0, 0),
    seasonal_order=(0, 0, 0),
    m=1,
    include_mean=True,
    transform_pars=True,
    method="CSS-ML",
    n_cond=None,
    SSinit="Gardner1980",
    optim_method="BFGS",
    optim_kwargs=None,
    kappa=1000000.0,
    max_p=5,
    max_q=5,
    max_P=2,
    max_Q=2,
    max_order=5,
    max_d=2,
    max_D=1,
    start_p=2,
    start_q=2,
    start_P=1,
    start_Q=1,
    stationary=False,
    seasonal=True,
    ic="aicc",
    stepwise=True,
    nmodels=94,
    trace=False,
    approximation=None,
    truncate=None,
    test="kpss",
    test_kwargs=None,
    seasonal_test="seas",
    seasonal_test_kwargs=None,
    allowdrift=True,
    allowmean=True,
    lambda_bc=None,
    biasadj=False,
)

Bases: BaseEstimator, RegressorMixin

Scikit-learn style wrapper for the ARIMA (AutoRegressive Integrated Moving Average) model and auto arima selection algorithm.

This estimator takes a univariate time series as input. Call fit(y) with a 1D pandas Series or NumPy array of observations in time order, then produce out-of-sample forecasts via predict(steps) and prediction intervals via predict_interval(steps, level=...). In-sample diagnostics are available through get_fitted_values(), get_residuals() and summary().

Parameters:

Name	Type	Description	Default
`order`	`tuple of int or None`	The (p, d, q) order of the non-seasonal ARIMA model: - p: AR order (number of lag observations) - d: Degree of differencing (number of times to difference the series) - q: MA order (size of moving average window) If None, the order will be automatically selected using auto_arima during fitting.	`(1, 0, 0)`
`seasonal_order`	`tuple of int or None`	The (P, D, Q) order of the seasonal component: - P: Seasonal AR order - D: Seasonal differencing order - Q: Seasonal MA order If None, the seasonal order will be automatically selected using auto_arima during fitting.	`(0, 0, 0)`
`m`	`int`	Seasonal period (e.g., 12 for monthly data with yearly seasonality, 4 for quarterly data). Set to 1 for non-seasonal models.	`1`
`include_mean`	`bool`	Whether to include a mean/intercept term in the model. Only applies when there is no differencing (d=0 and D=0).	`True`
`transform_pars`	`bool`	Whether to transform parameters to ensure stationarity and invertibility during optimization.	`True`
`method`	`str`	Estimation method. Options: - "CSS-ML": Conditional sum of squares for initial values, then maximum likelihood - "ML": Maximum likelihood only - "CSS": Conditional sum of squares only	`"CSS-ML"`
`n_cond`	`int or None`	Number of initial observations to use for conditional sum of squares. If None, defaults to max(p + d*m + P*m, q + Q*m).	`None`
`SSinit`	`str`	Method for state-space initialization. Options: - "Gardner1980": Gardner's method (default, more numerically stable) - "Rossignol2011": Rossignol's method (alternative)	`"Gardner1980"`
`optim_method`	`str`	Optimization method passed to scipy.optimize.minimize. Common options include "BFGS", "L-BFGS-B", "Nelder-Mead", etc.	`"BFGS"`
`optim_kwargs`	`dict or None`	Additional options passed to the optimizer (e.g., maxiter, ftol). If None, `{'maxiter': 1000}` is used.	`None`
`kappa`	`float`	Prior variance for diffuse states in the Kalman filter.	`1e6`
`max_p`	`int`	Maximum AR order for automatic model selection.	`5`
`max_q`	`int`	Maximum MA order for automatic model selection.	`5`
`max_P`	`int`	Maximum seasonal AR order for automatic model selection.	`2`
`max_Q`	`int`	Maximum seasonal MA order for automatic model selection.	`2`
`max_order`	`int`	Maximum sum of p+q+P+Q for automatic model selection.	`5`
`max_d`	`int`	Maximum non-seasonal differencing order for automatic selection.	`2`
`max_D`	`int`	Maximum seasonal differencing order for automatic selection.	`1`
`start_p`	`int`	Starting AR order for stepwise search.	`2`
`start_q`	`int`	Starting MA order for stepwise search.	`2`
`start_P`	`int`	Starting seasonal AR order for stepwise search.	`1`
`start_Q`	`int`	Starting seasonal MA order for stepwise search.	`1`
`stationary`	`bool`	Restrict automatic search to stationary models (d=D=0).	`False`
`seasonal`	`bool`	Include seasonal components in automatic search.	`True`
`ic`	`str`	Information criterion for automatic model selection: "aicc", "aic", or "bic".	`"aicc"`
`stepwise`	`bool`	Use stepwise search (faster) or exhaustive grid search for automatic selection.	`True`
`nmodels`	`int`	Maximum number of models to try in stepwise search.	`94`
`trace`	`bool`	Print progress during automatic model selection.	`False`
`approximation`	`bool or None`	Use CSS approximation during automatic search. If None, auto-determined based on data size.	`None`
`truncate`	`int or None`	Truncate series to this length for approximation offset computation.	`None`
`test`	`str`	Unit root test for automatic differencing determination: "kpss", "adf", or "pp".	`"kpss"`
`test_kwargs`	`dict or None`	Additional arguments for unit root test.	`None`
`seasonal_test`	`str`	Seasonal test for automatic seasonal differencing: "seas", "ocsb", "hegy", or "ch".	`"seas"`
`seasonal_test_kwargs`	`dict or None`	Additional arguments for seasonal test.	`None`
`allowdrift`	`bool`	Allow drift term in automatic selection when d+D=1.	`True`
`allowmean`	`bool`	Allow mean term in automatic selection when d+D=0.	`True`
`lambda_bc`	`float, str, or None`	Box-Cox transformation parameter: - None: No transformation - "auto": Automatically select lambda using Guerrero's method - float: Use the specified lambda value (0 = log transform)	`None`
`biasadj`	`bool`	Bias adjustment for Box-Cox back-transformation (produces mean forecasts instead of median). Only available for auto arima mode.	`False`
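The `lambda_bc` options above refer to the standard Box-Cox transformation. The following is a minimal sketch of that transformation and its inverse, written with the standard library only; the helper names `box_cox` and `inv_box_cox` are illustrative and are not part of skforecast. Back-transforming without bias adjustment, as done here, yields forecasts on the median scale, which is the behaviour `biasadj=False` describes.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of strictly positive values (lam = 0 is the log)."""
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]


def inv_box_cox(z, lam):
    """Inverse Box-Cox transform, with no bias adjustment."""
    if lam == 0:
        return [math.exp(v) for v in z]
    return [(lam * v + 1.0) ** (1.0 / lam) for v in z]


series = [1.0, 2.0, 4.0, 8.0]
transformed = box_cox(series, 0.5)
roundtrip = inv_box_cox(transformed, 0.5)  # recovers `series` up to rounding
log_transformed = box_cox(series, 0)       # same as taking natural logs
```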

Attributes:

Name	Type	Description
`order`	`tuple of int`	(p, d, q) non-seasonal ARIMA order stored on the estimator.
`seasonal_order`	`tuple of int`	(P, D, Q) seasonal ARIMA order stored on the estimator.
`m`	`int`	Seasonal period (e.g., 12 for monthly data).
`include_mean`	`bool`	Whether a mean/intercept term is included in the model.
`transform_pars`	`bool`	Whether parameters are transformed to enforce stationarity/invertibility.
`method`	`str`	Estimation method (e.g., "CSS-ML", "ML", "CSS").
`n_cond`	`int or None`	Number of observations used for conditional sum of squares (if any).
`SSinit`	`str`	State-space initialization method (e.g., "Gardner1980").
`optim_method`	`str`	Optimization method passed to the optimizer (e.g., "BFGS").
`optim_kwargs`	`dict or None`	Additional optimizer options.
`kappa`	`float`	Prior variance for diffuse states in the Kalman filter.
`max_p`	`int, default 5`	Maximum AR order for automatic model selection.
`max_q`	`int, default 5`	Maximum MA order for automatic model selection.
`max_P`	`int, default 2`	Maximum seasonal AR order for automatic model selection.
`max_Q`	`int, default 2`	Maximum seasonal MA order for automatic model selection.
`max_order`	`int, default 5`	Maximum sum of p+q+P+Q for automatic model selection.
`max_d`	`int, default 2`	Maximum non-seasonal differencing order for automatic selection.
`max_D`	`int, default 1`	Maximum seasonal differencing order for automatic selection.
`start_p`	`int, default 2`	Starting AR order for stepwise search.
`start_q`	`int, default 2`	Starting MA order for stepwise search.
`start_P`	`int, default 1`	Starting seasonal AR order for stepwise search.
`start_Q`	`int, default 1`	Starting seasonal MA order for stepwise search.
`stationary`	`bool, default False`	Restrict automatic search to stationary models (d=D=0).
`seasonal`	`bool, default True`	Include seasonal components in automatic search.
`ic`	`str, default "aicc"`	Information criterion for automatic model selection: "aicc", "aic", or "bic".
`stepwise`	`bool, default True`	Use stepwise search (faster) or exhaustive grid search for automatic selection.
`nmodels`	`int, default 94`	Maximum number of models to try in stepwise search.
`trace`	`bool, default False`	Print progress during automatic model selection.
`approximation`	`bool or None, default None`	Use CSS approximation during automatic search. If None, auto-determined based on data size.
`truncate`	`int or None, default None`	Truncate series to this length for approximation offset computation.
`test`	`str, default "kpss"`	Unit root test for automatic differencing determination: "kpss", "adf", or "pp".
`test_kwargs`	`dict or None, default None`	Additional arguments for unit root test.
`seasonal_test`	`str, default "seas"`	Seasonal test for automatic seasonal differencing: "seas", "ocsb", "hegy", or "ch".
`seasonal_test_kwargs`	`dict or None, default None`	Additional arguments for seasonal test.
`allowdrift`	`bool, default True`	Allow drift term in automatic selection when d+D=1.
`allowmean`	`bool, default True`	Allow mean term in automatic selection when d+D=0.
`lambda_bc`	`float, str, or None, default None`	Box-Cox transformation parameter: - None: No transformation - "auto": Automatically select lambda using Guerrero's method - float: Use the specified lambda value (0 = log transform)
`biasadj`	`bool, default False`	Bias adjustment for Box-Cox back-transformation (produces mean forecasts instead of median). Only available for auto arima mode.
`model_`	`dict`	Dictionary containing the fitted ARIMA model with keys: - 'y': Original training series - 'fitted': In-sample fitted values - 'coef': Coefficient DataFrame - 'sigma2': Innovation variance - 'var_coef': Variance-covariance matrix - 'loglik': Log-likelihood - 'aic': Akaike Information Criterion - 'bic': Bayesian Information Criterion - 'arma': ARIMA specification [p, q, P, Q, m, d, D] - 'residuals': Model residuals - 'converged': Convergence status - 'model': State-space model dict - 'method': Estimation method string
`y_train_`	`ndarray of shape (n_samples,)`	Original training series used for fitting.
`coef_`	`ndarray`	Flattened array of fitted coefficients (AR, MA, exogenous, intercept if present).
`coef_names_`	`list of str`	Names of coefficients in coef_.
`sigma2_`	`float`	Innovation variance (residual variance).
`loglik_`	`float`	Log-likelihood of the fitted model.
`aic_`	`float`	Akaike Information Criterion value.
`bic_`	`float or None`	Bayesian Information Criterion value (may be `None` if not available).
`arma_`	`list of int`	ARIMA specification: [p, q, P, Q, m, d, D].
`converged_`	`bool`	Whether the optimization converged successfully.
`n_features_in_`	`int`	Number of features in the target series (always 1, for sklearn compatibility).
`n_exog_names_in_`	`list or None`	Names of exogenous features seen during fitting. None if no exog was provided or if exog was not a pandas DataFrame.
`n_exog_features_in_`	`int`	Number of exogenous features seen during fitting (0 if no exog provided).
`fitted_values_`	`ndarray of shape (n_samples,)`	In-sample fitted values.
`in_sample_residuals_`	`ndarray of shape (n_samples,)`	In-sample residuals (observed - fitted).
`var_coef_`	`ndarray`	Variance-covariance matrix of coefficients.
`best_params_`	`dict or None`	If auto arima was used, dictionary with 'order', 'seasonal_order' and `m` of the selected best model. Otherwise None.
`is_auto`	`bool`	Flag indicating whether auto arima model selection is used.
`is_memory_reduced`	`bool`	Flag indicating whether reduce_memory() has been called.
`is_fitted`	`bool`	Flag indicating whether the estimator has been fitted.
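`estimator_name_`	`str`	String identifier of the model configuration (e.g., "Arima(1,1,1)" for a non-seasonal model or "Arima(1,1,1)(0,1,1)[12]" for a seasonal one). In auto arima mode it is "AutoArima()" before fitting and is updated after fitting to reflect the selected model.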
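The `arma_` specification lists the orders as [p, q, P, Q, m, d, D], which is a different ordering from the (p, d, q) and (P, D, Q) tuples used by `order` and `seasonal_order`. The sketch below shows the index mapping; `split_arma` is a hypothetical helper written for this illustration, not a skforecast function.

```python
def split_arma(arma):
    """Convert [p, q, P, Q, m, d, D] into ((p, d, q), (P, D, Q), m)."""
    p, q, P, Q, m, d, D = arma
    return (p, d, q), (P, D, Q), m


# Example spec: p=1, q=1, P=0, Q=1, m=12, d=1, D=1
order, seasonal_order, m = split_arma([1, 1, 0, 1, 12, 1, 1])
print(order, seasonal_order, m)
```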

Notes

The ARIMA model supports exogenous regressors which are incorporated directly into the likelihood function, unlike the two-step approach used in the ARAR model. This means the exogenous variables are modeled jointly with the ARMA errors, providing a more integrated treatment.

The model uses a state-space representation and the Kalman filter for likelihood computation and forecasting, which allows handling of missing values and provides efficient recursive prediction.
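To illustrate why a Kalman filter copes with missing values, here is a deliberately small sketch for a zero-mean AR(1) process only. It is not skforecast's implementation: the function name `ar1_kalman_loglik`, the stationary initialization, and the use of NaN to mark a gap are all assumptions made for this example. When an observation is missing, the update step is skipped and the prediction variance simply grows.

```python
import math


def ar1_kalman_loglik(y, phi, sigma2):
    """Gaussian log-likelihood of a zero-mean AR(1) via the Kalman filter.

    Missing observations are given as NaN and contribute no likelihood term.
    """
    # Stationary prior for the first state (requires |phi| < 1).
    a_pred = 0.0
    p_pred = sigma2 / (1.0 - phi ** 2)
    loglik = 0.0
    for obs in y:
        if math.isnan(obs):
            # No update: carry the prediction forward unchanged.
            a, p = a_pred, p_pred
        else:
            v = obs - a_pred  # one-step-ahead innovation
            loglik += -0.5 * (math.log(2.0 * math.pi * p_pred) + v * v / p_pred)
            # The state is observed without noise, so it is now known exactly.
            a, p = obs, 0.0
        # Prediction step for the next time point.
        a_pred = phi * a
        p_pred = phi ** 2 * p + sigma2
    return loglik


loglik_full = ar1_kalman_loglik([0.5, 0.2, 1.0], phi=0.5, sigma2=1.0)
loglik_with_gap = ar1_kalman_loglik([0.5, float("nan"), 1.0], phi=0.5, sigma2=1.0)
```

With the gap, the third observation is evaluated against a two-step-ahead prediction whose variance is sigma2 * (1 + phi**2) instead of sigma2.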

Methods:

Name	Description
`fit`	Fit the ARIMA model to a univariate time series.
`predict`	Generate mean forecasts steps ahead.
`predict_interval`	Forecast with prediction intervals.
`get_residuals`	Get in-sample residuals (observed - fitted) from the ARIMA model.
`get_fitted_values`	Get in-sample fitted values from the ARIMA model.
`get_feature_importances`	Get feature importances for Arima model.
`get_score`	Compute R^2 score using in-sample fitted values.
`get_info_criteria`	Get the selected information criterion.
`get_params`	Get parameters for this estimator.
`set_params`	Set the parameters of this estimator and reset the fitted state.
`summary`	Print a summary of the fitted ARIMA model.
`reduce_memory`	Free memory by deleting large attributes after fitting.
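The three criteria accepted by `ic` have the following textbook definitions. This sketch only restates those formulas; how skforecast counts parameters (for example, whether the innovation variance is included) and which number of observations it uses after differencing are not stated on this page, so `n_params` and `n_obs` are left to the caller. The function name is hypothetical.

```python
import math


def information_criteria(loglik, n_params, n_obs):
    """Textbook AIC, AICc and BIC from a log-likelihood."""
    aic = -2.0 * loglik + 2.0 * n_params
    bic = -2.0 * loglik + n_params * math.log(n_obs)
    # Small-sample correction; requires n_obs > n_params + 1.
    aicc = aic + (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    return {"aic": aic, "aicc": aicc, "bic": bic}


crit = information_criteria(loglik=-120.0, n_params=3, n_obs=100)
print(crit)
```

Lower values are better for all three; AICc approaches AIC as the sample grows.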

Source code in skforecast\stats\_arima.py

def __init__(
    self,
    order: tuple[int, int, int] | None = (1, 0, 0),
    seasonal_order: tuple[int, int, int] | None = (0, 0, 0),
    m: int = 1,
    include_mean: bool = True,
    transform_pars: bool = True,
    method: str = "CSS-ML",
    n_cond: int | None = None,
    SSinit: str = "Gardner1980",
    optim_method: str = "BFGS",
    optim_kwargs: dict | None = None,
    kappa: float = 1e6,
    max_p: int = 5,
    max_q: int = 5,
    max_P: int = 2,
    max_Q: int = 2,
    max_order: int = 5,
    max_d: int = 2,
    max_D: int = 1,
    start_p: int = 2,
    start_q: int = 2,
    start_P: int = 1,
    start_Q: int = 1,
    stationary: bool = False,
    seasonal: bool = True,
    ic: str = "aicc",
    stepwise: bool = True,
    nmodels: int = 94,
    trace: bool = False,
    approximation: bool | None = None,
    truncate: int | None = None,
    test: str = "kpss",
    test_kwargs: dict | None = None,
    seasonal_test: str = "seas",
    seasonal_test_kwargs: dict | None = None,
    allowdrift: bool = True,
    allowmean: bool = True,
    lambda_bc: float | str | None = None,
    biasadj: bool = False,
):

    if order is not None and len(order) != 3:
        raise ValueError(
            f"`order` must be a tuple of length 3, got length {len(order)}"
        )
    if seasonal_order is not None and len(seasonal_order) != 3:
        raise ValueError(
            f"`seasonal_order` must be a tuple of length 3, got length {len(seasonal_order)}"
        )
    if not isinstance(m, int) or m < 1:
        raise ValueError("`m` must be a positive integer (seasonal period).")

    self.order                = order
    self.seasonal_order       = seasonal_order
    self.m                    = m
    self.include_mean         = include_mean
    self.transform_pars       = transform_pars
    self.method               = method
    self.n_cond               = n_cond
    self.SSinit               = SSinit
    self.optim_method         = optim_method
    self.optim_kwargs         = optim_kwargs
    self.kappa                = kappa
    self.max_p                = max_p
    self.max_q                = max_q
    self.max_P                = max_P
    self.max_Q                = max_Q
    self.max_order            = max_order
    self.max_d                = max_d
    self.max_D                = max_D
    self.start_p              = start_p
    self.start_q              = start_q
    self.start_P              = start_P
    self.start_Q              = start_Q
    self.stationary           = stationary
    self.seasonal             = seasonal
    self.ic                   = ic
    self.stepwise             = stepwise
    self.nmodels              = nmodels
    self.trace                = trace
    self.approximation        = approximation
    self.truncate             = truncate
    self.test                 = test
    self.test_kwargs          = test_kwargs
    self.seasonal_test        = seasonal_test
    self.seasonal_test_kwargs = seasonal_test_kwargs
    self.allowdrift           = allowdrift
    self.allowmean            = allowmean
    self.lambda_bc            = lambda_bc
    self.biasadj              = biasadj

    self.is_auto              = order is None or seasonal_order is None
    self.model_               = None
    self.y_train_             = None
    self.coef_                = None
    self.coef_names_          = None
    self.sigma2_              = None
    self.loglik_              = None
    self.aic_                 = None
    self.bic_                 = None
    self.arma_                = None
    self.converged_           = None
    self.fitted_values_       = None
    self.in_sample_residuals_ = None
    self.var_coef_            = None
    self.n_features_in_       = None
    self.n_exog_names_in_     = None
    self.n_exog_features_in_  = None
    self.is_memory_reduced    = False
    self.is_fitted            = False
    self.best_params_         = None

    if self.optim_kwargs is None:
        self.optim_kwargs = {'maxiter': 1000}

    if self.is_auto:
        estimator_name_ = "AutoArima()"
    else:
        p, d, q = self.order
        P, D, Q = self.seasonal_order
        if P == 0 and D == 0 and Q == 0:
            estimator_name_ = f"Arima({p},{d},{q})"
        else:
            estimator_name_ = f"Arima({p},{d},{q})({P},{D},{Q})[{self.m}]"

    self.estimator_name_ = estimator_name_
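The naming rule at the end of `__init__` can be isolated as a small function. This is a standalone restatement for illustration (the helper `arima_name` and its `prefix` argument are not part of the class); it shows that the seasonal part and the period are omitted when the seasonal order is all zeros.

```python
def arima_name(order, seasonal_order, m, prefix="Arima"):
    """Build a label such as 'Arima(1,1,1)' or 'Arima(1,1,1)(0,1,1)[12]'."""
    p, d, q = order
    P, D, Q = seasonal_order
    if (P, D, Q) == (0, 0, 0):
        # Non-seasonal model: seasonal part and period are dropped.
        return f"{prefix}({p},{d},{q})"
    return f"{prefix}({p},{d},{q})({P},{D},{Q})[{m}]"


print(arima_name((1, 1, 1), (0, 0, 0), 1))
print(arima_name((1, 1, 1), (0, 1, 1), 12))
print(arima_name((2, 1, 0), (0, 0, 0), 1, prefix="AutoArima"))
```

In the source above, the "AutoArima" prefix is applied only after `fit` has selected a model; before fitting, an auto arima estimator is simply named "AutoArima()".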

order `instance-attribute` ¶


order = order

seasonal_order `instance-attribute` ¶


seasonal_order = seasonal_order

m `instance-attribute` ¶


m = m

include_mean `instance-attribute` ¶


include_mean = include_mean

transform_pars `instance-attribute` ¶


transform_pars = transform_pars

method `instance-attribute` ¶


method = method

n_cond `instance-attribute` ¶


n_cond = n_cond

SSinit `instance-attribute` ¶


SSinit = SSinit

optim_method `instance-attribute` ¶


optim_method = optim_method

optim_kwargs `instance-attribute` ¶


optim_kwargs = optim_kwargs

kappa `instance-attribute` ¶


kappa = kappa

max_p `instance-attribute` ¶


max_p = max_p

max_q `instance-attribute` ¶


max_q = max_q

max_P `instance-attribute` ¶


max_P = max_P

max_Q `instance-attribute` ¶


max_Q = max_Q

max_order `instance-attribute` ¶


max_order = max_order

max_d `instance-attribute` ¶


max_d = max_d

max_D `instance-attribute` ¶


max_D = max_D

start_p `instance-attribute` ¶


start_p = start_p

start_q `instance-attribute` ¶


start_q = start_q

start_P `instance-attribute` ¶


start_P = start_P

start_Q `instance-attribute` ¶


start_Q = start_Q

stationary `instance-attribute` ¶


stationary = stationary

seasonal `instance-attribute` ¶


seasonal = seasonal

ic `instance-attribute` ¶


ic = ic

stepwise `instance-attribute` ¶


stepwise = stepwise

nmodels `instance-attribute` ¶


nmodels = nmodels

trace `instance-attribute` ¶


trace = trace

approximation `instance-attribute` ¶


approximation = approximation

truncate `instance-attribute` ¶


truncate = truncate

test `instance-attribute` ¶


test = test

test_kwargs `instance-attribute` ¶


test_kwargs = test_kwargs

seasonal_test `instance-attribute` ¶


seasonal_test = seasonal_test

seasonal_test_kwargs `instance-attribute` ¶


seasonal_test_kwargs = seasonal_test_kwargs

allowdrift `instance-attribute` ¶


allowdrift = allowdrift

allowmean `instance-attribute` ¶


allowmean = allowmean

lambda_bc `instance-attribute` ¶


lambda_bc = lambda_bc

biasadj `instance-attribute` ¶


biasadj = biasadj

is_auto `instance-attribute` ¶


is_auto = order is None or seasonal_order is None

model_ `instance-attribute` ¶


model_ = None

y_train_ `instance-attribute` ¶


y_train_ = None

coef_ `instance-attribute` ¶


coef_ = None

coef_names_ `instance-attribute` ¶


coef_names_ = None

sigma2_ `instance-attribute` ¶


sigma2_ = None

loglik_ `instance-attribute` ¶


loglik_ = None

aic_ `instance-attribute` ¶


aic_ = None

bic_ `instance-attribute` ¶


bic_ = None

arma_ `instance-attribute` ¶


arma_ = None

converged_ `instance-attribute` ¶


converged_ = None

fitted_values_ `instance-attribute` ¶


fitted_values_ = None

in_sample_residuals_ `instance-attribute` ¶


in_sample_residuals_ = None

var_coef_ `instance-attribute` ¶


var_coef_ = None

n_features_in_ `instance-attribute` ¶


n_features_in_ = None

n_exog_names_in_ `instance-attribute` ¶


n_exog_names_in_ = None

n_exog_features_in_ `instance-attribute` ¶


n_exog_features_in_ = None

is_memory_reduced `instance-attribute` ¶


is_memory_reduced = False

is_fitted `instance-attribute` ¶


is_fitted = False

best_params_ `instance-attribute` ¶


best_params_ = None

estimator_name_ `instance-attribute` ¶


estimator_name_ = estimator_name_

fit ¶


fit(y, exog=None, suppress_warnings=False)

Fit the ARIMA model to a univariate time series.

If order or seasonal_order were not specified during initialization (i.e., set to None), this method will automatically determine the best model using auto arima (stepwise search by default; see the stepwise parameter).

Parameters:

Name	Type	Description	Default
`y`	`pandas Series, numpy ndarray of shape (n_samples,)`	Time-ordered numeric sequence.	required
`exog`	`pandas Series, pandas DataFrame, numpy ndarray of shape (n_samples, n_exog_features)`	Exogenous regressors to include in the model. These are incorporated directly into the ARIMA likelihood function.	`None`
`suppress_warnings`	`bool`	If True, suppress warnings during fitting (e.g., convergence warnings).	`False`

Returns:

Name	Type	Description
`self`	`Arima`	Fitted estimator. After fitting with automatic model selection, the selected orders are stored in `best_params_` (keys 'order', 'seasonal_order' and 'm'), and `estimator_name_` is updated with the chosen model.

Source code in skforecast\stats\_arima.py

def fit(
    self, 
    y: np.ndarray | pd.Series, 
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None,
    suppress_warnings: bool = False
) -> "Arima":
    """
    Fit the ARIMA model to a univariate time series.

    If `order` or `seasonal_order` were not specified during initialization 
    (i.e., set to None), this method will automatically determine the best 
    model using auto arima (stepwise search by default; see `stepwise`).

    Parameters
    ----------
    y : pandas Series, numpy ndarray of shape (n_samples,)
        Time-ordered numeric sequence.
    exog : pandas Series, pandas DataFrame,  numpy ndarray of shape (n_samples, n_exog_features), default None
        Exogenous regressors to include in the model. These are incorporated 
        directly into the ARIMA likelihood function.
    suppress_warnings : bool, default False
        If True, suppress warnings during fitting (e.g., convergence warnings).

    Returns
    -------
    self : Arima
        Fitted estimator. After fitting with automatic model selection, the 
        selected orders are stored in `best_params_` (keys 'order', 
        'seasonal_order' and 'm'), and `estimator_name_` is updated with the 
        chosen model.

    """

    self.is_auto              = self.order is None or self.seasonal_order is None
    self.model_               = None
    self.y_train_             = None
    self.coef_                = None
    self.coef_names_          = None
    self.sigma2_              = None
    self.loglik_              = None
    self.aic_                 = None
    self.bic_                 = None
    self.arma_                = None
    self.converged_           = None
    self.fitted_values_       = None
    self.in_sample_residuals_ = None
    self.var_coef_            = None
    self.n_features_in_       = None
    self.n_exog_names_in_     = None
    self.n_exog_features_in_  = None
    self.is_memory_reduced    = False
    self.is_fitted            = False
    self.best_params_         = None

    if not isinstance(y, (np.ndarray, pd.Series)):
        raise TypeError("`y` must be a pandas Series or numpy array.")

    if not isinstance(exog, (type(None), pd.Series, pd.DataFrame, np.ndarray)):
        raise TypeError("`exog` must be a pandas Series, DataFrame, numpy array, or None.")

    y = np.asarray(y, dtype=float)
    if y.ndim == 2 and y.shape[1] == 1:
        y = y.ravel()
    elif y.ndim != 1:
        raise ValueError("`y` must be 1-dimensional.")

    exog_names_in_ = None
    if exog is not None:
        if isinstance(exog, pd.DataFrame):
            exog_names_in_ = list(exog.columns)
        exog = np.asarray(exog, dtype=float)
        if exog.ndim == 1:
            exog = exog.reshape(-1, 1)
        elif exog.ndim != 2:
            raise ValueError("`exog` must be 1- or 2-dimensional.")

        if len(exog) != len(y):
            raise ValueError(
                f"Length of `exog` ({len(exog)}) does not match length of `y` ({len(y)})."
            )

    ctx = (warnings.catch_warnings() if suppress_warnings else nullcontext())
    with ctx:
        if suppress_warnings:
            warnings.simplefilter("ignore")

        if self.is_auto:
            self.model_ = auto_arima(
                y                  = y,
                m                  = self.m,
                d                  = self.order[1] if self.order is not None else None,
                D                  = self.seasonal_order[1] if self.seasonal_order is not None else None,
                max_p              = self.max_p,
                max_q              = self.max_q,
                max_P              = self.max_P,
                max_Q              = self.max_Q,
                max_order          = self.max_order,
                max_d              = self.max_d,
                max_D              = self.max_D,
                start_p            = self.start_p,
                start_q            = self.start_q,
                start_P            = self.start_P,
                start_Q            = self.start_Q,
                stationary         = self.stationary,
                seasonal           = self.seasonal,
                ic                 = self.ic,
                stepwise           = self.stepwise,
                nmodels            = self.nmodels,
                trace              = self.trace,
                approximation      = self.approximation,
                method             = self.method,
                truncate           = self.truncate,
                xreg               = exog,
                test               = self.test,
                test_args          = self.test_kwargs,
                seasonal_test      = self.seasonal_test,
                seasonal_test_args = self.seasonal_test_kwargs,
                allowdrift         = self.allowdrift,
                allowmean          = self.allowmean,
                lambda_bc          = self.lambda_bc,
                biasadj            = self.biasadj,
                SSinit             = self.SSinit,
                kappa              = self.kappa
            )

            best_model_order_ = (
                self.model_['arma'][0],
                self.model_['arma'][5],
                self.model_['arma'][1]
            )
            best_seasonal_order_ = (
                self.model_['arma'][2],
                self.model_['arma'][6],
                self.model_['arma'][3]
            )
            self.best_params_ = {
                'order': best_model_order_,
                'seasonal_order': best_seasonal_order_,
                'm': self.m
            }

            # NOTE: Only needed to update `estimator_name_` when auto arima is used
            p, d, q = best_model_order_
            P, D, Q = best_seasonal_order_
            if P == 0 and D == 0 and Q == 0:
                self.estimator_name_ = f"AutoArima({p},{d},{q})"
            else:
                self.estimator_name_ = f"AutoArima({p},{d},{q})({P},{D},{Q})[{self.m}]"

        else:
            self.model_ = arima(
                x              = y,
                m              = self.m,
                order          = self.order,
                seasonal       = self.seasonal_order,
                xreg           = exog,
                include_mean   = self.include_mean,
                transform_pars = self.transform_pars,
                fixed          = None,
                init           = None,
                method         = self.method,
                n_cond         = self.n_cond,
                SSinit         = self.SSinit,
                optim_method   = self.optim_method,
                opt_options    = self.optim_kwargs,
                kappa          = self.kappa
            )

    self.y_train_             = self.model_['y']
    self.coef_                = self.model_['coef'].to_numpy().ravel()
    self.coef_names_          = list(self.model_['coef'].columns)
    self.sigma2_              = self.model_['sigma2']
    self.loglik_              = self.model_['loglik']
    self.aic_                 = self.model_['aic']
    self.bic_                 = self.model_['bic']
    self.arma_                = self.model_['arma']
    self.converged_           = self.model_['converged']
    self.fitted_values_       = self.model_['fitted']
    self.in_sample_residuals_ = self.model_['residuals']
    self.var_coef_            = self.model_['var_coef']
    self.n_exog_names_in_     = exog_names_in_
    self.n_exog_features_in_  = exog.shape[1] if exog is not None else 0
    self.n_features_in_       = 1
    self.is_memory_reduced    = False
    self.is_fitted            = True

    if exog_names_in_ is not None:
        n_exog = len(exog_names_in_)
        self.coef_names_ = self.coef_names_[:-n_exog] + exog_names_in_

    return self
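The shape handling that `fit` applies to `y` and `exog` before estimation can be tried in isolation. The sketch below copies only that part of the logic into a hypothetical helper, `normalize_inputs`; unlike `fit`, it does not check the input types first, so it also accepts plain lists.

```python
import numpy as np


def normalize_inputs(y, exog=None):
    """Return y as a 1D float array and exog as a 2D float array (or None)."""
    y = np.asarray(y, dtype=float)
    if y.ndim == 2 and y.shape[1] == 1:
        y = y.ravel()  # a single column is flattened to 1D
    elif y.ndim != 1:
        raise ValueError("`y` must be 1-dimensional.")

    if exog is not None:
        exog = np.asarray(exog, dtype=float)
        if exog.ndim == 1:
            exog = exog.reshape(-1, 1)  # one regressor becomes one column
        elif exog.ndim != 2:
            raise ValueError("`exog` must be 1- or 2-dimensional.")
        if len(exog) != len(y):
            raise ValueError(
                f"Length of `exog` ({len(exog)}) does not match length of `y` ({len(y)})."
            )
    return y, exog


y_clean, exog_clean = normalize_inputs(np.ones((5, 1)), exog=np.arange(5))
print(y_clean.shape, exog_clean.shape)
```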

predict ¶


predict(steps, exog=None)

Generate mean forecasts steps ahead.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Forecast horizon (must be > 0).	required
`exog`	`ndarray, Series or DataFrame of shape (steps, n_exog_features)`	Exogenous regressors for the forecast period. Must have the same number of features as used during fitting.	`None`

Returns:

Name	Type	Description
`predictions`	`ndarray of shape (steps,)`	Point forecasts for steps 1..steps.

Raises:

Type	Description
`ValueError`	If model hasn't been fitted, steps <= 0, or exog shape is incorrect.

Source code in skforecast\stats\_arima.py

@check_is_fitted
def predict(
    self, 
    steps: int, 
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None
) -> np.ndarray:
    """
    Generate mean forecasts steps ahead.

    Parameters
    ----------
    steps : int
        Forecast horizon (must be > 0).
    exog : ndarray, Series or DataFrame of shape (steps, n_exog_features), default None
        Exogenous regressors for the forecast period. Must have the same 
        number of features as used during fitting.

    Returns
    -------
    predictions : ndarray of shape (steps,)
        Point forecasts for steps 1..steps.

    Raises
    ------
    ValueError
        If model hasn't been fitted, steps <= 0, or exog shape is incorrect.

    """

    if not isinstance(steps, (int, np.integer)) or steps <= 0:
        raise ValueError("`steps` must be a positive integer.")

    if exog is not None:
        exog = np.asarray(exog, dtype=float)
        if exog.ndim == 1:
            exog = exog.reshape(-1, 1)
        elif exog.ndim != 2:
            raise ValueError("`exog` must be 1- or 2-dimensional.")

        if len(exog) != steps:
            raise ValueError(
                f"Length of `exog` ({len(exog)}) must match `steps` ({steps})."
            )

        if exog.shape[1] != self.n_exog_features_in_:
            raise ValueError(
                f"Number of exogenous features ({exog.shape[1]}) does not match "
                f"the number used during fitting ({self.n_exog_features_in_})."
            )
    elif self.n_exog_features_in_ > 0:
        raise ValueError(
            f"Model was fitted with {self.n_exog_features_in_} exogenous features, "
            f"but `exog` was not provided for prediction."
        )

    if self.is_auto:
        predictions = forecast_arima(
            model   = self.model_,
            h       = steps,
            xreg    = exog
        )['mean']
    else:
        predictions = predict_arima(
            model   = self.model_,
            n_ahead = steps,
            newxreg = exog,
            se_fit  = False
        )['mean']

    return predictions

predict_interval ¶


predict_interval(
    steps=1,
    level=None,
    alpha=None,
    as_frame=True,
    exog=None,
)

Forecast with prediction intervals.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Forecast horizon.	`1`
`level`	`list or tuple of float`	Confidence levels in percent (e.g., 80 for 80% intervals). If None and alpha is None, defaults to (80, 95). Cannot be specified together with `alpha`.	`None`
`alpha`	`float`	The significance level for the prediction interval. If specified, a single interval with coverage (1 - alpha) * 100% is returned. For example, alpha=0.05 gives a 95% interval. Must lie strictly between 0 and 1. Cannot be specified together with `level`.	`None`
`as_frame`	`bool`	If True, return a tidy DataFrame with columns 'mean', 'lower_<L>', 'upper_<L>' for each level L. If False, return a NumPy ndarray with the same columns in the same order.	`True`
`exog`	`ndarray, Series or DataFrame of shape (steps, n_exog_features)`	Exogenous regressors for the forecast period.	`None`

Returns:

Name	Type	Description
`predictions`	`numpy ndarray, pandas DataFrame`	If as_frame=True, pandas DataFrame with columns 'mean', 'lower_<L>', 'upper_<L>' for each level L. If as_frame=False, numpy ndarray of shape (steps, 1 + 2 * n_levels).
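As a rough illustration of the column layout described above, the sketch below builds the column names for a list of levels in the same order as the source listing: the mean first, then one lower/upper pair per level. The helper name `interval_columns` is hypothetical and is not part of skforecast.

```python
def interval_columns(levels):
    """Return column names in the order: mean, lower_<L>, upper_<L>, ..."""
    columns = ["mean"]
    for lvl in levels:
        label = int(lvl)  # the label uses the integer part of the level
        columns.append(f"lower_{label}")
        columns.append(f"upper_{label}")
    return columns


print(interval_columns([80, 95]))
```

With `as_frame=False` the same ordering applies to the columns of the returned array, so column 0 holds the mean and each following pair holds the bounds for one level.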

Raises:

Type	Description
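`ValueError`	If model hasn't been fitted, `steps` is not a positive integer, both `level` and `alpha` are given, `alpha` is not strictly between 0 and 1, or exog has an incorrect shape or is missing for a model fitted with exogenous features.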

Notes

Prediction intervals are computed using the standard errors from the Kalman filter and assuming normally distributed innovations. The intervals fully account for both parameter uncertainty (through the variance-covariance matrix) and forecast uncertainty.
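To make the normality assumption concrete, here is a minimal sketch of how a two-sided interval can be derived from a forecast mean and its standard error. It uses only the Python standard library; the function names and the numeric inputs are illustrative assumptions. Whether the library applies exactly this formula internally is not shown in this section.

```python
from statistics import NormalDist


def normal_interval(mean, se, level=95.0):
    """Two-sided interval mean +/- z * se for a level given in percent."""
    if not 0 < level < 100:
        raise ValueError("`level` must be between 0 and 100.")
    # Standard normal quantile leaving (100 - level) / 2 percent in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 200.0)
    return mean - z * se, mean + z * se


def alpha_to_level(alpha):
    """Convert a significance level into a percentage, e.g. 0.05 -> 95."""
    return (1 - alpha) * 100


# Hypothetical forecast mean and standard error, not real model output.
lower, upper = normal_interval(mean=10.0, se=2.0, level=alpha_to_level(0.05))
print(round(lower, 2), round(upper, 2))
```

Because the standard error of a multi-step forecast typically grows with the horizon, intervals built this way widen as `steps` increases.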

Source code in skforecast\stats\_arima.py

@check_is_fitted
def predict_interval(
    self,
    steps: int = 1,
    level: list[float] | tuple[float, ...] | None = None,
    alpha: float | None = None,
    as_frame: bool = True,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None
) -> np.ndarray | pd.DataFrame:
    """
    Forecast with prediction intervals.

    Parameters
    ----------
    steps : int, default 1
        Forecast horizon.
    level : list or tuple of float, default None
        Confidence levels in percent (e.g., 80 for 80% intervals).
        If None and alpha is None, defaults to (80, 95).
        Cannot be specified together with `alpha`.
    alpha : float, default None
        The significance level for the prediction interval. 
        If specified, the confidence interval will be (1 - alpha) * 100%.
        For example, alpha=0.05 gives 95% intervals.
        Cannot be specified together with `level`.
    as_frame : bool, default True
        If True, return a tidy DataFrame with columns 'mean', 'lower_<L>',
        'upper_<L>' for each level L. If False, return a NumPy ndarray.
    exog : ndarray, Series or DataFrame of shape (steps, n_exog_features), default None
        Exogenous regressors for the forecast period.

    Returns
    -------
    predictions : numpy ndarray, pandas DataFrame
        If as_frame=True, pandas DataFrame with columns 'mean', 'lower_<L>',
        'upper_<L>' for each level L. If as_frame=False, numpy ndarray.

    Raises
    ------
    ValueError
        If model hasn't been fitted, steps <= 0, or exog shape is incorrect.

    Notes
    -----
    Prediction intervals are computed using the standard errors from the 
    Kalman filter and assuming normally distributed innovations. The intervals 
    fully account for both parameter uncertainty (through the variance-covariance 
    matrix) and forecast uncertainty.

    """

    if not isinstance(steps, (int, np.integer)) or steps <= 0:
        raise ValueError("`steps` must be a positive integer.")

    if level is not None and alpha is not None:
        raise ValueError(
            "Cannot specify both `level` and `alpha`. Use one or the other."
        )

    if alpha is not None:
        if not 0 < alpha < 1:
            raise ValueError("`alpha` must be between 0 and 1.")
        level = [(1 - alpha) * 100]
    elif level is None:
        level = (80, 95)

    if isinstance(level, (int, float, np.number)):
        level = [level]
    else:
        level = list(level)

    if exog is not None:
        exog = np.asarray(exog, dtype=float)
        if exog.ndim == 1:
            exog = exog.reshape(-1, 1)
        elif exog.ndim != 2:
            raise ValueError("`exog` must be 1- or 2-dimensional.")

        if len(exog) != steps:
            raise ValueError(
                f"Length of `exog` ({len(exog)}) must match `steps` ({steps})."
            )

        if exog.shape[1] != self.n_exog_features_in_:
            raise ValueError(
                f"Number of exogenous features ({exog.shape[1]}) does not match "
                f"the number used during fitting ({self.n_exog_features_in_})."
            )
    elif self.n_exog_features_in_ > 0:
        raise ValueError(
            f"Model was fitted with {self.n_exog_features_in_} exogenous features, "
            f"but `exog` was not provided for prediction."
        )

    if self.is_auto:
        raw_preds = forecast_arima(
            model   = self.model_,
            h       = steps,
            xreg    = exog,
            level   = level
        )
    else:
        raw_preds = predict_arima(
            model   = self.model_,
            n_ahead = steps,
            newxreg = exog,
            se_fit  = True,
            level   = level
        )

    levels = raw_preds['level']
    n_levels = len(levels)
    predictions = np.empty((steps, 1 + 2 * n_levels), dtype=float)
    predictions[:, 0] = raw_preds['mean']
    predictions[:, 1::2] = raw_preds['lower']
    predictions[:, 2::2] = raw_preds['upper']

    if as_frame:
        col_names = ["mean"]
        for level in levels:
            level = int(level)
            col_names.append(f"lower_{level}")
            col_names.append(f"upper_{level}")

        predictions = pd.DataFrame(
            data    = predictions,
            columns = col_names,
            index   = pd.RangeIndex(1, steps + 1, name="step")
        )

    return predictions

get_residuals ¶


get_residuals()

Get in-sample residuals (observed - fitted) from the ARIMA model.

Returns:

Name	Type	Description
`residuals`	`ndarray of shape (n_samples,)`	In-sample residuals.

Raises:

Type	Description
`NotFittedError`	If the model has not been fitted.
`RuntimeError`	If reduce_memory() has been called (residuals are no longer available).

Source code in skforecast\stats\_arima.py

@check_is_fitted
def get_residuals(self) -> np.ndarray:
    """
    Get in-sample residuals (observed - fitted) from the ARIMA model.

    Returns
    -------
    residuals : ndarray of shape (n_samples,)
        In-sample residuals.

    Raises
    ------
    NotFittedError
        If the model has not been fitted.
    RuntimeError
        If reduce_memory() has been called (residuals are no longer available).

    """

    check_memory_reduced(self, method_name='get_residuals')
    return self.in_sample_residuals_

get_fitted_values ¶


get_fitted_values()

Get in-sample fitted values from the ARIMA model.

Returns:

Name	Type	Description
`fitted`	`ndarray of shape (n_samples,)`	In-sample fitted values.

Raises:

Type	Description
`NotFittedError`	If the model has not been fitted.
`RuntimeError`	If reduce_memory() has been called (fitted values are no longer available).

Source code in skforecast\stats\_arima.py

@check_is_fitted
def get_fitted_values(self) -> np.ndarray:
    """
    Get in-sample fitted values from the ARIMA model.

    Returns
    -------
    fitted : ndarray of shape (n_samples,)
        In-sample fitted values.

    Raises
    ------
    NotFittedError
        If the model has not been fitted.
    RuntimeError
        If reduce_memory() has been called (fitted values are no longer available).

    """

    check_memory_reduced(self, method_name='get_fitted_values')
    return self.fitted_values_

get_feature_importances ¶


get_feature_importances()

Get feature importances for Arima model.

Source code in skforecast\stats\_arima.py

@check_is_fitted
def get_feature_importances(self) -> pd.DataFrame:
    """Get feature importances for Arima model."""
    importances = pd.DataFrame({
        'feature': self.coef_names_,
        'importance': self.coef_
    })
    return importances

get_score ¶


get_score(y=None)

Compute R^2 score using in-sample fitted values.

Parameters:

Name	Type	Description	Default
`y`	`ignored`	Present for API compatibility with sklearn.	`None`

Returns:

Name	Type	Description
`score`	`float`	Coefficient of determination (R^2).
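The calculation can be sketched without any dependencies as follows. This mirrors the steps in the source listing (drop positions where either value is NaN, return NaN when fewer than two points remain, add a tiny epsilon to the total sum of squares), but the function name and the toy data are assumptions made for illustration.

```python
import math
import sys


def r2_ignoring_nan(y, fitted):
    """R^2 over the positions where both `y` and `fitted` are not NaN."""
    pairs = [
        (obs, fit) for obs, fit in zip(y, fitted)
        if not (math.isnan(obs) or math.isnan(fit))
    ]
    if len(pairs) < 2:
        return math.nan  # not enough points to compute a variance

    mean_obs = sum(obs for obs, _ in pairs) / len(pairs)
    ss_res = sum((obs - fit) ** 2 for obs, fit in pairs)
    # A tiny epsilon keeps the division defined for a constant series.
    ss_tot = sum((obs - mean_obs) ** 2 for obs, _ in pairs) + sys.float_info.epsilon
    return 1.0 - ss_res / ss_tot


# Toy values, not output from a fitted model.
y = [1.0, 2.0, 3.0, 4.0, math.nan]
fitted = [math.nan, 2.1, 2.9, 4.2, 5.0]
print(round(r2_ignoring_nan(y, fitted), 4))
```

Note that this is an in-sample score, so it says how closely the model reproduces the training series rather than how well it forecasts unseen data.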

Source code in skforecast\stats\_arima.py

@check_is_fitted
def get_score(self, y: None = None) -> float:
    """
    Compute R^2 score using in-sample fitted values.

    Parameters
    ----------
    y : ignored
        Present for API compatibility with sklearn.

    Returns
    -------
    score : float
        Coefficient of determination (R^2).

    """

    check_memory_reduced(self, method_name='get_score')

    y = self.y_train_
    fitted = self.fitted_values_

    # Handle NaN values if any
    mask = ~(np.isnan(y) | np.isnan(fitted))
    if mask.sum() < 2:
        return np.nan

    ss_res = np.sum((y[mask] - fitted[mask]) ** 2)
    ss_tot = np.sum((y[mask] - y[mask].mean()) ** 2) + np.finfo(float).eps

    return 1.0 - ss_res / ss_tot

get_info_criteria ¶


get_info_criteria(criteria='aic')

Get the selected information criterion.

Parameters:

Name	Type	Description	Default
`criteria`	`str`	The information criterion to retrieve. Valid options are {'aic', 'bic'}.	`'aic'`

Returns:

Name	Type	Description
`metric`	`float`	The value of the selected information criterion.
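For readers unfamiliar with these criteria, the sketch below uses the textbook definitions of AIC and BIC in terms of the log-likelihood. How the library counts the number of parameters and observations is not shown in this section, so the inputs here are hypothetical and the values need not match `aic_` or `bic_` exactly.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * loglik


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: k log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2 * loglik


# Hypothetical values: log-likelihood -120.5, 3 parameters, 100 observations.
print(round(aic(-120.5, 3), 2))
print(round(bic(-120.5, 3, 100), 2))
```

Lower values indicate a better trade-off between fit and complexity; BIC penalises extra parameters more heavily than AIC once the sample has more than about seven observations.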

Source code in skforecast\stats\_arima.py

@check_is_fitted
def get_info_criteria(self, criteria: str = 'aic') -> float:
    """
    Get the selected information criterion.

    Parameters
    ----------
    criteria : str, default 'aic'
        The information criterion to retrieve. Valid options are 
        {'aic', 'bic'}.

    Returns
    -------
    metric : float
        The value of the selected information criterion.

    """

    if criteria not in ['aic', 'bic']:
        raise ValueError(
            f"Invalid value for `criteria`: '{criteria}'. "
            f"Valid options are 'aic' and 'bic'."
        )

    if criteria == 'aic':
        value = self.aic_
    elif criteria == 'bic':
        # NOTE: BIC may not be available, e.g. when the model did not
        # converge or estimation failed for other reasons.
        value = self.bic_ if self.bic_ is not None else np.nan

    return value

get_params ¶


get_params(deep=True)

Get parameters for this estimator.

Parameters:

Name	Type	Description	Default
`deep`	`bool`	If True, will return the parameters for this estimator and contained subobjects that are estimators.	`True`

Returns:

Name	Type	Description
`params`	`dict`	Parameter names mapped to their values.

Source code in skforecast\stats\_arima.py

def get_params(self, deep: bool = True) -> dict:
    """
    Get parameters for this estimator.

    Parameters
    ----------
    deep : bool, default True
        If True, will return the parameters for this estimator and
        contained subobjects that are estimators.

    Returns
    -------
    params : dict
        Parameter names mapped to their values.
    """
    return {
        "order": self.order,
        "seasonal_order": self.seasonal_order,
        "m": self.m,
        "include_mean": self.include_mean,
        "transform_pars": self.transform_pars,
        "method": self.method,
        "n_cond": self.n_cond,
        "SSinit": self.SSinit,
        "optim_method": self.optim_method,
        "optim_kwargs": self.optim_kwargs,
        "kappa": self.kappa,
    }

_set_params ¶


_set_params(**params)

Set the parameters of this estimator without resetting the fitted state. This method is intended for internal use only; use set_params() instead.

Parameters:

Name	Type	Description	Default
`**params`	`dict`	Estimator parameters.	`{}`

Returns:

Type	Description
`None`
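Besides assigning attributes, this method refreshes the display name of the estimator. The standalone function below reproduces that naming rule from the source listing so the three cases are easy to see. `arima_name` is a hypothetical helper written for this illustration only.

```python
def arima_name(order, seasonal_order, m=1):
    """Build a display name following the branching in `_set_params`."""
    # Automatic selection is used when either order is left as None.
    if order is None or seasonal_order is None:
        return "AutoArima()"
    p, d, q = order
    P, D, Q = seasonal_order
    # A null seasonal part is omitted from the name.
    if P == 0 and D == 0 and Q == 0:
        return f"Arima({p},{d},{q})"
    return f"Arima({p},{d},{q})({P},{D},{Q})[{m}]"


print(arima_name((1, 0, 0), (0, 0, 0)))        # Arima(1,0,0)
print(arima_name((2, 1, 1), (1, 1, 0), m=12))  # Arima(2,1,1)(1,1,0)[12]
print(arima_name(None, (0, 0, 0)))             # AutoArima()
```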

Source code in skforecast\stats\_arima.py

def _set_params(self, **params) -> None:
    """
    Set the parameters of this estimator. Internal method without resetting 
    the fitted state. This method is intended for internal use only, please 
    use `set_params()` instead.

    Parameters
    ----------
    **params : dict
        Estimator parameters.

    Returns
    -------
    None

    """

    for key, value in params.items():
        setattr(self, key, value)

    self.is_auto = self.order is None or self.seasonal_order is None
    if self.is_auto:
        estimator_name_ = "AutoArima()"
    else:
        p, d, q = self.order
        P, D, Q = self.seasonal_order
        if P == 0 and D == 0 and Q == 0:
            estimator_name_ = f"Arima({p},{d},{q})"
        else:
            estimator_name_ = f"Arima({p},{d},{q})({P},{D},{Q})[{self.m}]"

    self.estimator_name_ = estimator_name_

set_params ¶


set_params(**params)

Set the parameters of this estimator and reset the fitted state.

This method resets the estimator to its unfitted state whenever parameters are changed, requiring the model to be refitted before making predictions.

Parameters:

Name	Type	Description	Default
`**params`	`dict`	Estimator parameters. Valid parameter keys are: 'order', 'seasonal_order', 'm', 'include_mean', 'transform_pars', 'method', 'n_cond', 'SSinit', 'optim_method', 'optim_kwargs', 'kappa'.	`{}`

Returns:

Name	Type	Description
`self`	`Arima`	The estimator with updated parameters and reset state.

Raises:

Type	Description
`ValueError`	If any parameter key is invalid.
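The validate-then-reset behaviour can be sketched with a small stand-in class. `TinyEstimator` is hypothetical and far simpler than `Arima`; it only illustrates that unknown keys are rejected before any state changes and that a successful update discards what was learned during fitting.

```python
class TinyEstimator:
    """Hypothetical stand-in that mimics the validate-then-reset pattern."""

    _valid_params = {"order", "m"}

    def __init__(self, order=(1, 0, 0), m=1):
        self.order = order
        self.m = m
        self.coef_ = None
        self.is_fitted = False

    def fit(self):
        self.coef_ = [0.5]  # placeholder for estimated coefficients
        self.is_fitted = True
        return self

    def set_params(self, **params):
        # Reject unknown keys before touching any state.
        for key in params:
            if key not in self._valid_params:
                raise ValueError(f"Invalid parameter '{key}'.")
        for key, value in params.items():
            setattr(self, key, value)
        # Changing a parameter invalidates whatever was learned during fit.
        self.coef_ = None
        self.is_fitted = False
        return self


est = TinyEstimator().fit()
est.set_params(order=(2, 1, 0))
print(est.order, est.is_fitted)
```

In practice this means a call to `set_params` must always be followed by a new `fit` before predicting.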

Source code in skforecast\stats\_arima.py

def set_params(self, **params) -> "Arima":
    """
    Set the parameters of this estimator and reset the fitted state.

    This method resets the estimator to its unfitted state whenever parameters
    are changed, requiring the model to be refitted before making predictions.

    Parameters
    ----------
    **params : dict
        Estimator parameters. Valid parameter keys are: 'order', 'seasonal_order',
        'm', 'include_mean', 'transform_pars', 'method', 'n_cond', 'SSinit',
        'optim_method', 'optim_kwargs', 'kappa'.

    Returns
    -------
    self : Arima
        The estimator with updated parameters and reset state.

    Raises
    ------
    ValueError
        If any parameter key is invalid.

    """

    valid_params = {
        'order', 'seasonal_order', 'm', 'include_mean', 'transform_pars',
        'method', 'n_cond', 'SSinit', 'optim_method', 'optim_kwargs', 'kappa'
    }
    for key in params.keys():
        if key not in valid_params:
            raise ValueError(
                f"Invalid parameter '{key}'. Valid parameters are: {valid_params}"
            )

    self._set_params(**params)

    fitted_attrs = [
        'model_', 'y_train_', 'coef_', 'coef_names_', 'sigma2_', 'loglik_',
        'aic_', 'bic_', 'arma_', 'converged_', 'fitted_values_', 'in_sample_residuals_',
        'var_coef_', 'n_features_in_', 'n_exog_features_in_', 'n_exog_names_in_'
    ]
    for attr in fitted_attrs:
        setattr(self, attr, None)

    self.is_memory_reduced = False
    self.is_fitted = False

    return self

summary ¶


summary()

Print a summary of the fitted ARIMA model. Includes model specification, coefficients, fit statistics, and residual diagnostics. If reduce_memory() has been called, the residual and training-series statistics are omitted.
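The coefficient table in the summary reports a standard error and a t statistic for each coefficient. A minimal sketch of that calculation, assuming a variance-covariance matrix given as nested lists, is shown below. The names and numbers are made up for illustration.

```python
import math


def coef_table(names, coefs, var_coef):
    """Return (name, coef, se, t) rows from a variance-covariance matrix."""
    rows = []
    for i, (name, coef) in enumerate(zip(names, coefs)):
        se = math.sqrt(var_coef[i][i])  # standard error from the diagonal
        t_stat = coef / se if se > 0 else math.nan
        rows.append((name, coef, se, t_stat))
    return rows


# Hypothetical coefficients and covariance matrix, for illustration only.
names = ["ar1", "ma1"]
coefs = [0.6, -0.3]
var_coef = [[0.01, 0.0], [0.0, 0.04]]
for name, coef, se, t_stat in coef_table(names, coefs, var_coef):
    print(f"{name:5s} coef={coef:7.4f} se={se:7.4f} t={t_stat:6.2f}")
```

A t statistic that is large in absolute value suggests the coefficient is well separated from zero relative to its estimated uncertainty.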

Source code in skforecast\stats\_arima.py

@check_is_fitted
def summary(self) -> None:
    """
    Print a summary of the fitted ARIMA model.
    Includes model specification, coefficients, fit statistics, and residual diagnostics.
    If reduce_memory() has been called, summary information will be limited.
    """

    print("ARIMA Model Summary")
    print("=" * 60)
    print(f"Model     : {self.estimator_name_}")
    print(f"Method    : {self.model_['method']}")
    print(f"Converged : {self.converged_}")
    print()

    print("Coefficients:")
    print("-" * 60)
    for i, name in enumerate(self.coef_names_):
        # Extract standard error from variance-covariance matrix
        if self.var_coef_ is not None and i < self.var_coef_.shape[0] and i < self.var_coef_.shape[1]:
            se = np.sqrt(self.var_coef_[i, i])
            t_stat = self.coef_[i] / se if se > 0 else np.nan
            print(f"  {name:15s}: {self.coef_[i]:10.4f}  (SE: {se:8.4f}, t: {t_stat:8.2f})")
        else:
            print(f"  {name:15s}: {self.coef_[i]:10.4f}")
    print()

    print("Model fit statistics:")
    print(f"  sigma^2:             {self.sigma2_:.6f}")
    print(f"  Log-likelihood:      {self.loglik_:.2f}")
    print(f"  AIC:                 {self.aic_:.2f}")
    if self.bic_ is not None:
        print(f"  BIC:                 {self.bic_:.2f}")
    else:
        print(f"  BIC:                 N/A")
    print()

    if not self.is_memory_reduced:
        print("Residual statistics:")
        print(f"  Mean:                {np.mean(self.in_sample_residuals_):.6f}")
        print(f"  Std Dev:             {np.std(self.in_sample_residuals_, ddof=1):.6f}")
        print(f"  MAE:                 {np.mean(np.abs(self.in_sample_residuals_)):.6f}")
        print(f"  RMSE:                {np.sqrt(np.mean(self.in_sample_residuals_**2)):.6f}")
        print()

        print("Time Series Summary Statistics:")
        print(f"Number of observations: {len(self.y_train_)}")
        print(f"  Mean:                 {np.mean(self.y_train_):.4f}")
        print(f"  Std Dev:              {np.std(self.y_train_, ddof=1):.4f}")
        print(f"  Min:                  {np.min(self.y_train_):.4f}")
        print(f"  25%:                  {np.percentile(self.y_train_, 25):.4f}")
        print(f"  Median:               {np.median(self.y_train_):.4f}")
        print(f"  75%:                  {np.percentile(self.y_train_, 75):.4f}")
        print(f"  Max:                  {np.max(self.y_train_):.4f}")

reduce_memory ¶


reduce_memory()

Free memory by deleting large attributes after fitting.

This method removes fitted values, residuals, and other intermediate results that are not strictly necessary for prediction. After calling this method, get_residuals(), get_fitted_values() and get_score() will no longer work and summary() will omit residual and training-series statistics, but prediction methods will continue to function.

Call this method only if you need to reduce memory usage and don't need access to diagnostic information.

Returns:

Name	Type	Description
`self`	`Arima`	The estimator with reduced memory footprint.
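The pattern behind this method is to delete the arrays that only diagnostics need and to guard the diagnostic methods with a flag. The toy class below sketches that idea. `TinyModel` and its attributes are hypothetical, and the guard is written inline here rather than through the library's `check_memory_reduced` helper, whose implementation is not shown in this section.

```python
class TinyModel:
    """Hypothetical stand-in showing the drop-and-guard pattern."""

    def __init__(self):
        self.coef_ = [0.5]                       # needed for prediction
        self.in_sample_residuals_ = [0.1, -0.2]  # only needed for diagnostics
        self.is_memory_reduced = False

    def reduce_memory(self):
        # Delete diagnostic arrays; keep everything prediction relies on.
        for attr in ("in_sample_residuals_",):
            if hasattr(self, attr):
                delattr(self, attr)
        self.is_memory_reduced = True
        return self

    def get_residuals(self):
        if self.is_memory_reduced:
            raise RuntimeError("Residuals were removed by reduce_memory().")
        return self.in_sample_residuals_

    def predict(self, last_value):
        return self.coef_[0] * last_value


model = TinyModel().reduce_memory()
print(model.predict(2.0))
```

Since the deleted arrays cannot be restored without refitting, it makes sense to run any diagnostics before reducing memory.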

Source code in skforecast\stats\_arima.py

@check_is_fitted
def reduce_memory(self) -> "Arima":
    """
    Free memory by deleting large attributes after fitting.

    This method removes fitted values, residuals, and other intermediate 
    results that are not strictly necessary for prediction. After calling 
    this method, get_residuals(), get_fitted_values() and get_score() will 
    no longer work and summary() will omit residual and training-series 
    statistics, but prediction methods will continue to function.

    Call this method only if you need to reduce memory usage and don't 
    need access to diagnostic information.

    Returns
    -------
    self : Arima
        The estimator with reduced memory footprint.

    """

    attrs_to_delete = ['y_train_', 'fitted_values_', 'in_sample_residuals_']

    for attr in attrs_to_delete:
        if hasattr(self, attr):
            delattr(self, attr)

    self.is_memory_reduced = True

    warnings.warn(
        "Memory reduced. Diagnostic methods (get_residuals, get_fitted_values, "
        "summary, get_score) are no longer available. Prediction methods remain functional.",
        UserWarning
    )

    return self

skforecast.stats._sarimax.Sarimax ¶


Sarimax(
    order=(1, 0, 0),
    seasonal_order=(0, 0, 0, 0),
    trend=None,
    measurement_error=False,
    time_varying_regression=False,
    mle_regression=True,
    simple_differencing=False,
    enforce_stationarity=True,
    enforce_invertibility=True,
    hamilton_representation=False,
    concentrate_scale=False,
    trend_offset=1,
    use_exact_diffuse=False,
    dates=None,
    freq=None,
    missing="none",
    validate_specification=True,
    method="lbfgs",
    maxiter=50,
    start_params=None,
    disp=False,
    sm_init_kwargs={},
    sm_fit_kwargs={},
    sm_predict_kwargs={},
)

Bases: BaseEstimator, RegressorMixin

A universal scikit-learn style wrapper for statsmodels SARIMAX.

This class wraps the statsmodels.tsa.statespace.sarimax.SARIMAX model [1]_ [2]_ to follow the scikit-learn style. The following docstring is based on the statsmodels documentation and it is highly recommended to visit their site for the best level of detail.

Parameters:

Name	Type	Description	Default
`order`	`tuple`	The (p,d,q) order of the model for the number of AR parameters, differences, and MA parameters. `d` must be an integer indicating the integration order of the process. `p` and `q` may either be integers indicating the AR and MA orders (so that all lags up to those orders are included) or else iterables giving specific AR and / or MA lags to include.	`(1, 0, 0)`
`seasonal_order`	`tuple`	The (P,D,Q,s) order of the seasonal component of the model for the AR parameters, differences, MA parameters, and periodicity. `D` must be an integer indicating the integration order of the process. `P` and `Q` may either be integers indicating the AR and MA orders (so that all lags up to those orders are included) or else iterables giving specific AR and / or MA lags to include. `s` is an integer giving the periodicity (number of periods in a season); it is often 4 for quarterly data or 12 for monthly data.	`(0, 0, 0, 0)`
`trend`	`str`	Parameter controlling the deterministic trend polynomial `A(t)`. `'c'` indicates a constant (i.e. a degree zero component of the trend polynomial). `'t'` indicates a linear trend with time. `'ct'` indicates both, `'c'` and `'t'`. Can also be specified as an iterable defining the non-zero polynomial exponents to include, in increasing order. For example, `[1,1,0,1]` denotes `a + b*t + ct^3`.	`None`
`measurement_error`	`bool`	Whether or not to assume the endogenous observations `y` were measured with error.	`False`
`time_varying_regression`	`bool`	Used when explanatory variables, `exog`, are provided, to select whether or not coefficients on the exogenous regressors are allowed to vary over time.	`False`
`mle_regression`	`bool`	Whether or not to estimate the regression coefficients for the exogenous variables as part of maximum likelihood estimation or through the Kalman filter (i.e. recursive least squares). If `time_varying_regression` is `True`, this must be set to `False`.	`True`
`simple_differencing`	`bool`	Whether or not to use partially conditional maximum likelihood estimation. If `True`, differencing is performed prior to estimation, which discards the first `s*D + d` initial rows but results in a smaller state-space formulation. If `False`, the full SARIMAX model is put in state-space form so that all data points can be used in estimation.	`False`
`enforce_stationarity`	`bool`	Whether or not to transform the AR parameters to enforce stationarity in the autoregressive component of the model.	`True`
`enforce_invertibility`	`bool`	Whether or not to transform the MA parameters to enforce invertibility in the moving average component of the model.	`True`
`hamilton_representation`	`bool`	Whether or not to use the Hamilton representation of an ARMA process (if `True`) or the Harvey representation (if `False`).	`False`
`concentrate_scale`	`bool`	Whether or not to concentrate the scale (variance of the error term) out of the likelihood. This reduces the number of parameters estimated by maximum likelihood by one, but standard errors will then not be available for the scale parameter.	`False`
`trend_offset`	`int`	The offset at which to start time trend values. Default is 1, so that if `trend='t'` the trend is equal to 1, 2, ..., nobs. Typically only set when the model is created by extending a previous dataset.	`1`
`use_exact_diffuse`	`bool`	Whether or not to use exact diffuse initialization for non-stationary states. Default is `False` (in which case approximate diffuse initialization is used).	`False`
`method`	`str`	The solver from scipy.optimize to use. One of: `'newton'` (Newton-Raphson), `'nm'` (Nelder-Mead), `'bfgs'` (Broyden-Fletcher-Goldfarb-Shanno), `'lbfgs'` (limited-memory BFGS with optional box constraints), `'powell'` (modified Powell's method), `'cg'` (conjugate gradient), `'ncg'` (Newton-conjugate gradient), `'basinhopping'` (global basin-hopping solver).	`'lbfgs'`
`maxiter`	`int`	The maximum number of iterations to perform.	`50`
`start_params`	`numpy ndarray`	Initial guess of the solution for the loglikelihood maximization. If `None`, the default is given by estimator.start_params.	`None`
`disp`	`bool`	Set to `True` to print convergence messages.	`False`
`sm_init_kwargs`	`dict`	Additional keyword arguments to pass to the statsmodels SARIMAX model when it is initialized.	`{}`
`sm_fit_kwargs`	`dict`	Additional keyword arguments to pass to the `fit` method of the statsmodels SARIMAX model. The statsmodels SARIMAX.fit parameters `method`, `maxiter`, `start_params` and `disp` have been moved to the initialization of this model and take priority over those provided via `sm_fit_kwargs`.	`{}`
`sm_predict_kwargs`	`dict`	Additional keyword arguments to pass to the `get_forecast` method of the statsmodels SARIMAXResults object.	`{}`
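The priority rule for `sm_fit_kwargs` can be pictured as a dictionary merge in which the explicit constructor arguments overwrite any duplicates supplied by the user. The sketch below is an assumption about what "priority" means here; the internal `_consolidate_kwargs` method is not shown in this section, and `consolidate_fit_kwargs` is a hypothetical name.

```python
def consolidate_fit_kwargs(method, maxiter, start_params, disp, sm_fit_kwargs):
    """Merge user kwargs with explicit arguments; explicit ones win."""
    merged = dict(sm_fit_kwargs)  # copy so the caller's dict is not mutated
    merged.update(
        method=method, maxiter=maxiter, start_params=start_params, disp=disp
    )
    return merged


# The user asks for maxiter=500, but the explicit maxiter=50 takes priority.
user_kwargs = {"maxiter": 500, "cov_type": "robust"}
merged = consolidate_fit_kwargs("lbfgs", 50, None, False, user_kwargs)
print(merged["maxiter"], merged["cov_type"])
```

Under this reading, the way to change the optimiser settings is through the constructor arguments, while `sm_fit_kwargs` is for everything else that `fit` accepts.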

Attributes:

Name	Type	Description
`order`	`tuple`	The (p,d,q) order of the model for the number of AR parameters, differences, and MA parameters.
`seasonal_order`	`tuple`	The (P,D,Q,s) order of the seasonal component of the model for the AR parameters, differences, MA parameters, and periodicity.
`trend`	`str`	Deterministic trend polynomial `A(t)`.
`measurement_error`	`bool`	Whether or not to assume the endogenous observations `y` were measured with error.
`time_varying_regression`	`bool`	Used when explanatory variables, `exog`, are provided, to select whether or not coefficients on the exogenous regressors are allowed to vary over time.
`mle_regression`	`bool`	Whether or not to estimate the regression coefficients for the exogenous variables as part of maximum likelihood estimation or through the Kalman filter (i.e. recursive least squares). If `time_varying_regression` is `True`, this must be set to `False`.
`simple_differencing`	`bool`	Whether or not to use partially conditional maximum likelihood estimation.
`enforce_stationarity`	`bool`	Whether or not to transform the AR parameters to enforce stationarity in the autoregressive component of the model.
`enforce_invertibility`	`bool`	Whether or not to transform the MA parameters to enforce invertibility in the moving average component of the model.
`hamilton_representation`	`bool`	Whether or not to use the Hamilton representation of an ARMA process (if `True`) or the Harvey representation (if `False`).
`concentrate_scale`	`bool`	Whether or not to concentrate the scale (variance of the error term) out of the likelihood. This reduces the number of parameters estimated by maximum likelihood by one, but standard errors will then not be available for the scale parameter.
`trend_offset`	`int`	The offset at which to start time trend values.
`use_exact_diffuse`	`bool`	Whether or not to use exact diffuse initialization for non-stationary states.
`method`	`str`	The method determines which solver from scipy.optimize is used.
`maxiter`	`int`	The maximum number of iterations to perform.
`start_params`	`numpy ndarray`	Initial guess of the solution for the loglikelihood maximization.
`disp`	`bool`	Set to `True` to print convergence messages.
`sm_init_kwargs`	`dict`	Additional keyword arguments to pass to the statsmodels SARIMAX model when it is initialized.
`sm_fit_kwargs`	`dict`	Additional keyword arguments to pass to the `fit` method of the statsmodels SARIMAX model.
`sm_predict_kwargs`	`dict`	Additional keyword arguments to pass to the `get_forecast` method of the statsmodels SARIMAXResults object.
`_sarimax_params`	`dict`	Parameters of this model that can be set with the `set_params` method.
`output_type`	`str`	Format of the object returned by the predict method. This is set automatically according to the type of `y` used in the fit method to train the model, `'numpy'` or `'pandas'`.
`sarimax`	`object`	The statsmodels.tsa.statespace.sarimax.SARIMAX object created.
`is_fitted`	`bool`	Tag to identify if the estimator has been fitted (trained).
`sarimax_res`	`object`	The resulting statsmodels.tsa.statespace.sarimax.SARIMAXResults object created by statsmodels after fitting the SARIMAX model.
`training_index`	`pandas Index`	Index of the training series, available when it is a pandas Series or DataFrame.
`estimator_name_`	`str`	String identifier of the fitted model configuration (e.g., "Sarimax(1,1,1)(0,0,0)[1]"). This is updated after fitting to reflect the selected model.
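The iterable form of `trend` described in the parameter table can be read as an indicator list over polynomial powers, so `[1, 1, 0, 1]` selects powers 0, 1 and 3. The following sketch evaluates such a polynomial at a single time point. The function name and the coefficient values are hypothetical; in a fitted model the trend coefficients are estimated by statsmodels.

```python
def trend_value(indicator, coefs, t):
    """Evaluate a trend polynomial given an indicator list like [1, 1, 0, 1]."""
    # Positions with a non-zero flag are the powers included in the polynomial.
    powers = [k for k, flag in enumerate(indicator) if flag]
    if len(powers) != len(coefs):
        raise ValueError("One coefficient is needed per included power.")
    return sum(c * t ** k for c, k in zip(coefs, powers))


# [1, 1, 0, 1] corresponds to a + b*t + c*t^3 with a=2.0, b=0.5, c=0.1.
print(trend_value([1, 1, 0, 1], coefs=[2.0, 0.5, 0.1], t=2))
```

The string shortcuts map onto the same idea: `'c'` is `[1]`, `'t'` is `[0, 1]` and `'ct'` is `[1, 1]`.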

References

.. [1] Statsmodels SARIMAX API Reference. https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html

.. [2] Statsmodels SARIMAXResults API Reference. https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.sarimax.SARIMAXResults.html

Methods:

Name	Description
`fit`	Fit the model to the data.
`predict`	Forecast future values and, if desired, their confidence intervals.
`append`	Recreate the results object with new data appended to the original data.
`apply`	Apply the fitted parameters to new data unrelated to the original data.
`extend`	Recreate the results object for new data that extends the original data.
`set_params`	Set new values to the parameters of the estimator.
`get_params`	Get the non-trainable parameters of the estimator.
`params`	Get the parameters of the model.
`summary`	Get a summary of the SARIMAXResults object.
`get_info_criteria`	Get the selected information criteria.
`get_feature_importances`	Get feature importances for SARIMAX statsmodels model.

Source code in skforecast\stats\_sarimax.py

def __init__(
    self,
    order: tuple = (1, 0, 0),
    seasonal_order: tuple = (0, 0, 0, 0),
    trend: str = None,
    measurement_error: bool = False,
    time_varying_regression: bool = False,
    mle_regression: bool = True,
    simple_differencing: bool = False,
    enforce_stationarity: bool = True,
    enforce_invertibility: bool = True,
    hamilton_representation: bool = False,
    concentrate_scale: bool = False,
    trend_offset: int = 1,
    use_exact_diffuse: bool = False,
    dates = None,
    freq = None,
    missing = 'none',
    validate_specification: bool = True,
    method: str = 'lbfgs',
    maxiter: int = 50,
    start_params: np.ndarray = None,
    disp: bool = False,
    sm_init_kwargs: dict[str, object] = {},
    sm_fit_kwargs: dict[str, object] = {},
    sm_predict_kwargs: dict[str, object] = {}
) -> None:

    self.order                   = order
    self.seasonal_order          = seasonal_order
    self.trend                   = trend
    self.measurement_error       = measurement_error
    self.time_varying_regression = time_varying_regression
    self.mle_regression          = mle_regression
    self.simple_differencing     = simple_differencing
    self.enforce_stationarity    = enforce_stationarity
    self.enforce_invertibility   = enforce_invertibility
    self.hamilton_representation = hamilton_representation
    self.concentrate_scale       = concentrate_scale
    self.trend_offset            = trend_offset
    self.use_exact_diffuse       = use_exact_diffuse
    self.dates                   = dates
    self.freq                    = freq
    self.missing                 = missing
    self.validate_specification  = validate_specification
    self.method                  = method
    self.maxiter                 = maxiter
    self.start_params            = start_params
    self.disp                    = disp

    # Create the dictionaries with the additional statsmodels parameters to be
    # used during the init, fit and predict methods. Note that the statsmodels
    # SARIMAX.fit parameters `method`, `maxiter`, `start_params` and `disp`
    # have been moved to the initialization of this model and take priority
    # over those provided by the user via `sm_fit_kwargs`.
    self.sm_init_kwargs    = sm_init_kwargs
    self.sm_fit_kwargs     = sm_fit_kwargs
    self.sm_predict_kwargs = sm_predict_kwargs

    # Params that can be set with the `set_params` method
    _, _, _, _sarimax_params = inspect.getargvalues(inspect.currentframe())
    self._sarimax_params = {
        k: v for k, v in _sarimax_params.items() 
        if k not in ['self', '_', '_sarimax_params']
    }

    self._consolidate_kwargs()

    # Create Results Attributes 
    self.output_type    = None
    self.sarimax        = None
    self.is_fitted      = False
    self.sarimax_res    = None
    self.training_index = None

    p, d, q = self.order
    P, D, Q, m = self.seasonal_order
    self.estimator_name_ = f"Sarimax({p},{d},{q})({P},{D},{Q})[{m}]"

order `instance-attribute` ¶


order = order

seasonal_order `instance-attribute` ¶


seasonal_order = seasonal_order

trend `instance-attribute` ¶


trend = trend

measurement_error `instance-attribute` ¶


measurement_error = measurement_error

time_varying_regression `instance-attribute` ¶


time_varying_regression = time_varying_regression

mle_regression `instance-attribute` ¶


mle_regression = mle_regression

simple_differencing `instance-attribute` ¶


simple_differencing = simple_differencing

enforce_stationarity `instance-attribute` ¶


enforce_stationarity = enforce_stationarity

enforce_invertibility `instance-attribute` ¶


enforce_invertibility = enforce_invertibility

hamilton_representation `instance-attribute` ¶


hamilton_representation = hamilton_representation

concentrate_scale `instance-attribute` ¶


concentrate_scale = concentrate_scale

trend_offset `instance-attribute` ¶


trend_offset = trend_offset

use_exact_diffuse `instance-attribute` ¶


use_exact_diffuse = use_exact_diffuse

dates `instance-attribute` ¶


dates = dates

freq `instance-attribute` ¶


freq = freq

missing `instance-attribute` ¶


missing = missing

validate_specification `instance-attribute` ¶


validate_specification = validate_specification

method `instance-attribute` ¶


method = method

maxiter `instance-attribute` ¶


maxiter = maxiter

start_params `instance-attribute` ¶


start_params = start_params

disp `instance-attribute` ¶


disp = disp

sm_init_kwargs `instance-attribute` ¶


sm_init_kwargs = sm_init_kwargs

sm_fit_kwargs `instance-attribute` ¶


sm_fit_kwargs = sm_fit_kwargs

sm_predict_kwargs `instance-attribute` ¶


sm_predict_kwargs = sm_predict_kwargs

_sarimax_params `instance-attribute` ¶


_sarimax_params = {
    k: v
    for k, v in (items())
    if k not in ["self", "_", "_sarimax_params"]
}

output_type `instance-attribute` ¶


output_type = None

sarimax `instance-attribute` ¶


sarimax = None

is_fitted `instance-attribute` ¶


is_fitted = False

sarimax_res `instance-attribute` ¶


sarimax_res = None

training_index `instance-attribute` ¶


training_index = None

estimator_name_ `instance-attribute` ¶


estimator_name_ = f'Sarimax({p},{d},{q})({P},{D},{Q})[{m}]'

_consolidate_kwargs ¶


_consolidate_kwargs()

Create the dictionaries to be used during the init, fit, and predict methods. Note that the parameters set in this model's initialization take precedence over those provided by the user via the statsmodels kwargs dicts.

Parameters:

Name	Type	Description	Default
`self`			required

Returns:

Type	Description
`None`

Source code in skforecast\stats\_sarimax.py

def _consolidate_kwargs(
    self
) -> None:
    """
    Create the dictionaries to be used during the init, fit, and predict methods.
    Note that the parameters set in this model's initialization take precedence
    over those provided by the user via the statsmodels kwargs dicts.

    Parameters
    ----------
    self

    Returns
    -------
    None

    """

    # statsmodels.tsa.statespace.SARIMAX parameters
    _init_kwargs = self.sm_init_kwargs.copy()
    _init_kwargs.update({
       'order': self.order,
       'seasonal_order': self.seasonal_order,
       'trend': self.trend,
       'measurement_error': self.measurement_error,
       'time_varying_regression': self.time_varying_regression,
       'mle_regression': self.mle_regression,
       'simple_differencing': self.simple_differencing,
       'enforce_stationarity': self.enforce_stationarity,
       'enforce_invertibility': self.enforce_invertibility,
       'hamilton_representation': self.hamilton_representation,
       'concentrate_scale': self.concentrate_scale,
       'trend_offset': self.trend_offset,
       'use_exact_diffuse': self.use_exact_diffuse,
       'dates': self.dates,
       'freq': self.freq,
       'missing': self.missing,
       'validate_specification': self.validate_specification
    })
    self._init_kwargs = _init_kwargs

    # statsmodels.tsa.statespace.SARIMAX.fit parameters
    _fit_kwargs = self.sm_fit_kwargs.copy()
    _fit_kwargs.update({
       'method': self.method,
       'maxiter': self.maxiter,
       'start_params': self.start_params,
       'disp': self.disp,
    })        
    self._fit_kwargs = _fit_kwargs

    # statsmodels.tsa.statespace.SARIMAXResults.get_forecast parameters
    self._predict_kwargs = self.sm_predict_kwargs.copy()
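
The precedence rule can be illustrated with plain dictionaries. The sketch below is not part of the library; it only reproduces the copy-then-update pattern used in `_consolidate_kwargs`, with made-up values for the user-supplied dictionary.

```python
# User-supplied extras, as they might be passed through `sm_fit_kwargs`
# (illustrative values only).
sm_fit_kwargs = {"maxiter": 500, "cov_type": "robust"}

# Values taken from the constructor arguments (the documented defaults).
init_values = {"method": "lbfgs", "maxiter": 50, "start_params": None, "disp": False}

# Same pattern as `_consolidate_kwargs`: copy first, then overwrite.
fit_kwargs = sm_fit_kwargs.copy()
fit_kwargs.update(init_values)

print(fit_kwargs["maxiter"])   # 50: the constructor value wins
print(fit_kwargs["cov_type"])  # robust: keys not set in the constructor pass through
```

Because the user dictionary is copied before it is updated, the original `sm_fit_kwargs` is left untouched. To change `maxiter`, pass it as a constructor argument rather than inside `sm_fit_kwargs`.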

_create_sarimax ¶


_create_sarimax(endog, exog=None)

A helper method to create a new statsmodels SARIMAX model.

Additional keyword arguments to pass to the statsmodels SARIMAX model when it is initialized can be added with the `sm_init_kwargs` argument when initializing the model.

Parameters:

Name	Type	Description	Default
`endog`	`numpy ndarray, pandas Series, pandas DataFrame`	The endogenous variable.	required
`exog`	`numpy ndarray, pandas Series, pandas DataFrame`	The exogenous variables.	`None`

Returns:

Type	Description
`None`

Source code in skforecast\stats\_sarimax.py

def _create_sarimax(
    self,
    endog: np.ndarray | pd.Series | pd.DataFrame,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None
) -> None:
    """
    A helper method to create a new statsmodels SARIMAX model.

    Additional keyword arguments to pass to the statsmodels SARIMAX model
    when it is initialized can be added with the `sm_init_kwargs` argument
    when initializing the model.

    Parameters
    ----------
    endog : numpy ndarray, pandas Series, pandas DataFrame
        The endogenous variable.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default None
        The exogenous variables.

    Returns
    -------
    None

    """

    self.sarimax = SARIMAX(endog=endog, exog=exog, **self._init_kwargs)

fit ¶


fit(y, exog=None)

Fit the model to the data.

Additional keyword arguments to pass to the fit method of the statsmodels SARIMAX model can be added with the `sm_fit_kwargs` argument when initializing the model.

Parameters:

Name	Type	Description	Default
`y`	`numpy ndarray, pandas Series, pandas DataFrame`	Training time series.	required
`exog`	`numpy ndarray, pandas Series, pandas DataFrame`	Exogenous variable/s included as predictor/s. Must have the same number of observations as `y` and their indexes must be aligned so that y[i] is regressed on exog[i].	`None`

Returns:

Type	Description
`None`

Source code in skforecast\stats\_sarimax.py

def fit(
    self,
    y: np.ndarray | pd.Series | pd.DataFrame,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None
) -> None:
    """
    Fit the model to the data.

    Additional keyword arguments to pass to the `fit` method of the
    statsmodels SARIMAX model can be added with the `sm_fit_kwargs` argument
    when initializing the model.

    Parameters
    ----------
    y : numpy ndarray, pandas Series, pandas DataFrame
        Training time series.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default None
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `y` and their indexes must be aligned so
        that y[i] is regressed on exog[i].

    Returns
    -------
    None

    """

    # Reset values in case the model has already been fitted.
    self.output_type    = None
    self.sarimax_res    = None
    self.is_fitted      = False
    self.training_index = None

    self.output_type = 'numpy' if isinstance(y, np.ndarray) else 'pandas'

    self._create_sarimax(endog=y, exog=exog)
    self.sarimax_res = self.sarimax.fit(**self._fit_kwargs)
    self.is_fitted = True

    if self.output_type == 'pandas':
        self.training_index = y.index

predict ¶


predict(
    steps, exog=None, return_conf_int=False, alpha=0.05
)

Forecast future values and, if desired, their confidence intervals.

Generate forecasts `steps` steps into the future, optionally with confidence intervals. If exogenous variables were used to fit the model, they must also be provided here; otherwise the prediction fails.

Additional keyword arguments to pass to the get_forecast method of the statsmodels SARIMAX results object can be added with the `sm_predict_kwargs` argument when initializing the model.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Number of steps to predict.	required
`exog`	`numpy ndarray, pandas Series, pandas DataFrame`	Value of the exogenous variable/s for the next steps. The number of observations needed is the number of steps to predict.	`None`
`return_conf_int`	`bool`	Whether to get the confidence intervals of the forecasts.	`False`
`alpha`	`float`	The confidence intervals for the forecasts are (1 - alpha) %.	`0.05`

Returns:

Name	Type	Description
`predictions`	`numpy ndarray, pandas DataFrame`	Values predicted by the forecaster and their estimated interval. The output type is the same as the type of `y` used in the fit method. pred: predictions. lower_bound: lower bound of the interval. (if `return_conf_int`) upper_bound: upper bound of the interval. (if `return_conf_int`)
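
As a rough illustration of how `alpha` maps to interval bounds, the sketch below builds a two-sided (1 - alpha) interval around a point forecast. It assumes normally distributed forecast errors, which is my understanding of how statsmodels forecast intervals are computed; the helper name `normal_interval` and the numbers are hypothetical.

```python
from statistics import NormalDist


def normal_interval(mean, std_error, alpha=0.05):
    """Return (lower, upper) bounds of a two-sided (1 - alpha) normal interval."""
    # Quantile leaving alpha / 2 probability in each tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return mean - z * std_error, mean + z * std_error


# Hypothetical forecast of 100 with a standard error of 2.
lower, upper = normal_interval(mean=100.0, std_error=2.0, alpha=0.05)
print(round(lower, 2), round(upper, 2))  # 96.08 103.92
```

A smaller `alpha` gives a wider interval: `alpha=0.01` corresponds to a 99 % interval, `alpha=0.20` to an 80 % interval.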

Source code in skforecast\stats\_sarimax.py

@check_is_fitted
def predict(
    self,
    steps: int,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None, 
    return_conf_int: bool = False,
    alpha: float = 0.05
) -> np.ndarray | pd.DataFrame:
    """
    Forecast future values and, if desired, their confidence intervals.

    Generate forecasts `steps` steps into the future, optionally with
    confidence intervals. If exogenous variables were used to fit the model,
    they must also be provided here; otherwise the prediction fails.

    Additional keyword arguments to pass to the `get_forecast` method of the
    statsmodels SARIMAX results object can be added with the
    `sm_predict_kwargs` argument when initializing the model.

    Parameters
    ----------
    steps : int
        Number of steps to predict. 
    exog : numpy ndarray, pandas Series, pandas DataFrame, default None
        Value of the exogenous variable/s for the next steps. The number of 
        observations needed is the number of steps to predict. 
    return_conf_int : bool, default False
        Whether to get the confidence intervals of the forecasts.
    alpha : float, default 0.05
        The confidence intervals for the forecasts are (1 - alpha) %.

    Returns
    -------
    predictions : numpy ndarray, pandas DataFrame
        Values predicted by the forecaster and their estimated interval. The 
        output type is the same as the type of `y` used in the fit method.

        - pred: predictions.
        - lower_bound: lower bound of the interval. (if `return_conf_int`)
        - upper_bound: upper bound of the interval. (if `return_conf_int`)

    """

    # This is done because statsmodels doesn't allow `exog` length greater than
    # the number of steps
    if exog is not None and len(exog) > steps:
        warnings.warn(
            f"When predicting using exogenous variables, the `exog` parameter "
            f"must have the same length as the number of predicted steps. Since "
            f"len(exog) > steps, only the first {steps} observations are used."
        )
        exog = exog[:steps]

    predictions = self.sarimax_res.get_forecast(
                      steps = steps,
                      exog  = exog,
                      **self._predict_kwargs
                  )

    if not return_conf_int:
        predictions = predictions.predicted_mean
        if self.output_type == 'pandas':
            predictions = predictions.rename("pred").to_frame()
    else:
        if self.output_type == 'numpy':
            predictions = np.column_stack(
                              [predictions.predicted_mean,
                               predictions.conf_int(alpha=alpha)]
                          )
        else:
            predictions = pd.concat((
                              predictions.predicted_mean,
                              predictions.conf_int(alpha=alpha)),
                              axis = 1
                          )
            predictions.columns = ['pred', 'lower_bound', 'upper_bound']

    return predictions
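
The handling of an over-long `exog` can be reproduced in isolation. The helper below is a hypothetical stand-in (not a library function) that mirrors the check at the top of `predict`: extra rows are dropped and a warning is issued instead of raising an error.

```python
import warnings


def truncate_exog(exog, steps):
    """Keep only the first `steps` rows of `exog`, warning if any are dropped."""
    if exog is not None and len(exog) > steps:
        warnings.warn(
            f"len(exog) > steps, only the first {steps} observations are used."
        )
        exog = exog[:steps]
    return exog


# Capture the warning so it can be inspected instead of printed to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    trimmed = truncate_exog([10, 11, 12, 13, 14], steps=3)

print(trimmed)      # [10, 11, 12]
print(len(caught))  # 1
```

The same slicing expression works row-wise on numpy arrays, pandas Series and pandas DataFrames. An `exog` shorter than `steps` is not covered by this check and is left for statsmodels to reject.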

append ¶


append(
    y,
    exog=None,
    refit=False,
    copy_initialization=False,
    **kwargs
)

Recreate the results object with new data appended to the original data.

Creates a new result object applied to a dataset that is created by appending new data to the end of the model's original data [1]_. The new results can then be used for analysis or forecasting.

Parameters:

Name	Type	Description	Default
`y`	`numpy ndarray, pandas Series, pandas DataFrame`	New observations from the modeled time-series process.	required
`exog`	`numpy ndarray, pandas Series, pandas DataFrame`	New observations of exogenous regressors, if applicable. Must have the same number of observations as `y` and their indexes must be aligned so that y[i] is regressed on exog[i].	`None`
`refit`	`bool`	Whether to re-fit the parameters, based on the combined dataset.	`False`
`copy_initialization`	`bool`	Whether or not to copy the initialization from the current results set to the new model.	`False`
`**kwargs`		Keyword arguments may be used to modify model specification arguments when creating the new model object.	`{}`

Returns:

Type	Description
`None`

Notes

The y and exog arguments to this method must be formatted in the same way (e.g. Pandas Series versus Numpy array) as were the y and exog arrays passed to the original model.

The y argument to this method should consist of new observations that occurred directly after the last element of y. For any other kind of dataset, see the apply method.

This method will apply filtering to all of the original data as well as to the new data. To apply filtering only to the new data (which can be much faster if the original dataset is large), see the extend method.

References

.. [1] Statsmodels MLEResults append API Reference. https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.append.html#statsmodels.tsa.statespace.mlemodel.MLEResults.append

Source code in skforecast\stats\_sarimax.py

@check_is_fitted
def append(
    self,
    y: np.ndarray | pd.Series | pd.DataFrame,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None,
    refit: bool = False,
    copy_initialization: bool = False,
    **kwargs
) -> None:
    """
    Recreate the results object with new data appended to the original data.

    Creates a new result object applied to a dataset that is created by 
    appending new data to the end of the model's original data [1]_. The new 
    results can then be used for analysis or forecasting.

    Parameters
    ----------
    y : numpy ndarray, pandas Series, pandas DataFrame
        New observations from the modeled time-series process.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default None
        New observations of exogenous regressors, if applicable. Must have
        the same number of observations as `y` and their indexes must be
        aligned so that y[i] is regressed on exog[i].
    refit : bool, default False
        Whether to re-fit the parameters, based on the combined dataset.
    copy_initialization : bool, default False
        Whether or not to copy the initialization from the current results
        set to the new model.
    **kwargs
        Keyword arguments may be used to modify model specification arguments
        when creating the new model object.

    Returns
    -------
    None

    Notes
    -----
    The `y` and `exog` arguments to this method must be formatted in the same 
    way (e.g. Pandas Series versus Numpy array) as were the `y` and `exog` 
    arrays passed to the original model.

    The `y` argument to this method should consist of new observations that 
    occurred directly after the last element of `y`. For any other kind of 
    dataset, see the apply method.

    This method will apply filtering to all of the original data as well as 
    to the new data. To apply filtering only to the new data (which can be 
    much faster if the original dataset is large), see the extend method.

    References
    ----------
    .. [1] Statsmodels MLEResults append API Reference.
           https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.append.html#statsmodels.tsa.statespace.mlemodel.MLEResults.append

    """

    fit_kwargs = self._fit_kwargs if refit else None

    self.sarimax_res = self.sarimax_res.append(
                           endog               = y,
                           exog                = exog,
                           refit               = refit,
                           copy_initialization = copy_initialization,
                           fit_kwargs          = fit_kwargs,
                           **kwargs
                       )

apply ¶


apply(
    y,
    exog=None,
    refit=False,
    copy_initialization=False,
    **kwargs
)

Apply the fitted parameters to new data unrelated to the original data.

Creates a new result object using the current fitted parameters, applied to a completely new dataset that is assumed to be unrelated to the model's original data [1]_. The new results can then be used for analysis or forecasting.

Parameters:

Name	Type	Description	Default
`y`	`numpy ndarray, pandas Series, pandas DataFrame`	New observations from the modeled time-series process.	required
`exog`	`numpy ndarray, pandas Series, pandas DataFrame`	New observations of exogenous regressors, if applicable. Must have the same number of observations as `y` and their indexes must be aligned so that y[i] is regressed on exog[i].	`None`
`refit`	`bool`	Whether to re-fit the parameters, using the new dataset.	`False`
`copy_initialization`	`bool`	Whether or not to copy the initialization from the current results set to the new model.	`False`
`**kwargs`		Keyword arguments may be used to modify model specification arguments when creating the new model object.	`{}`

Returns:

Type	Description
`None`

Notes

The y argument to this method should consist of new observations that are not necessarily related to the original model's y dataset. For observations that continue the original dataset by following directly after its last element, see the append and extend methods.

References

.. [1] Statsmodels MLEResults apply API Reference. https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.apply.html#statsmodels.tsa.statespace.mlemodel.MLEResults.apply

Source code in skforecast\stats\_sarimax.py

@check_is_fitted
def apply(
    self,
    y: np.ndarray | pd.Series | pd.DataFrame,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None,
    refit: bool = False,
    copy_initialization: bool = False,
    **kwargs
) -> None:
    """
    Apply the fitted parameters to new data unrelated to the original data.

    Creates a new result object using the current fitted parameters, applied 
    to a completely new dataset that is assumed to be unrelated to the model's
    original data [1]_. The new results can then be used for analysis or forecasting.

    Parameters
    ----------
    y : numpy ndarray, pandas Series, pandas DataFrame
        New observations from the modeled time-series process.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default None
        New observations of exogenous regressors, if applicable. Must have
        the same number of observations as `y` and their indexes must be
        aligned so that y[i] is regressed on exog[i].
    refit : bool, default False
        Whether to re-fit the parameters, using the new dataset.
    copy_initialization : bool, default False
        Whether or not to copy the initialization from the current results
        set to the new model.
    **kwargs
        Keyword arguments may be used to modify model specification arguments
        when creating the new model object.

    Returns
    -------
    None

    Notes
    -----
    The `y` argument to this method should consist of new observations that
    are not necessarily related to the original model's `y` dataset. For
    observations that continue the original dataset by following directly
    after its last element, see the append and extend methods.

    References
    ----------
    .. [1] Statsmodels MLEResults apply API Reference.
           https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.apply.html#statsmodels.tsa.statespace.mlemodel.MLEResults.apply

    """

    fit_kwargs = self._fit_kwargs if refit else None

    self.sarimax_res = self.sarimax_res.apply(
                           endog               = y,
                           exog                = exog,
                           refit               = refit,
                           copy_initialization = copy_initialization,
                           fit_kwargs          = fit_kwargs,
                           **kwargs
                       )

extend ¶


extend(y, exog=None, **kwargs)

Recreate the results object for new data that extends the original data.

Creates a new result object applied to a new dataset that is assumed to follow directly from the end of the model's original data [1]_. The new results can then be used for analysis or forecasting.

Parameters:

Name	Type	Description	Default
`y`	`numpy ndarray, pandas Series, pandas DataFrame`	New observations from the modeled time-series process.	required
`exog`	`numpy ndarray, pandas Series, pandas DataFrame`	New observations of exogenous regressors, if applicable. Must have the same number of observations as `y` and their indexes must be aligned so that y[i] is regressed on exog[i].	`None`
`**kwargs`		Keyword arguments may be used to modify model specification arguments when creating the new model object.	`{}`

Returns:

Type	Description
`None`

Notes

The y argument to this method should consist of new observations that occurred directly after the last element of the model's original y array. For any other kind of dataset, see the apply method.

This method will apply filtering only to the new data provided by the y argument, which can be much faster than re-filtering the entire dataset. However, the returned results object will only have results for the new data. To retrieve results for both the new data and the original data, see the append method.

References

.. [1] Statsmodels MLEResults extend API Reference. https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.extend.html#statsmodels.tsa.statespace.mlemodel.MLEResults.extend

Source code in skforecast\stats\_sarimax.py

@check_is_fitted
def extend(
    self,
    y: np.ndarray | pd.Series | pd.DataFrame,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None,
    **kwargs
) -> None:
    """
    Recreate the results object for new data that extends the original data.

    Creates a new result object applied to a new dataset that is assumed to 
    follow directly from the end of the model's original data [1]_. The new 
    results can then be used for analysis or forecasting.

    Parameters
    ----------
    y : numpy ndarray, pandas Series, pandas DataFrame
        New observations from the modeled time-series process.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default None
        New observations of exogenous regressors, if applicable. Must have
        the same number of observations as `y` and their indexes must be
        aligned so that y[i] is regressed on exog[i].
    **kwargs
        Keyword arguments may be used to modify model specification arguments
        when creating the new model object.

    Returns
    -------
    None

    Notes
    -----
    The `y` argument to this method should consist of new observations that 
    occurred directly after the last element of the model's original `y` 
    array. For any other kind of dataset, see the apply method.

    This method will apply filtering only to the new data provided by the `y` 
    argument, which can be much faster than re-filtering the entire dataset. 
    However, the returned results object will only have results for the new 
    data. To retrieve results for both the new data and the original data, 
    see the append method.

    References
    ----------
    .. [1] Statsmodels MLEResults extend API Reference.
           https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.extend.html#statsmodels.tsa.statespace.mlemodel.MLEResults.extend

    """

    self.sarimax_res = self.sarimax_res.extend(
                           endog = y,
                           exog  = exog,
                           **kwargs
                       )

set_params ¶


set_params(**params)

Set new values to the parameters of the estimator.

Parameters:

Name	Type	Description	Default
`params`	`dict`	Parameters values.	`{}`

Returns:

Type	Description
`None`

Source code in skforecast\stats\_sarimax.py

def set_params(
    self, 
    **params: dict[str, object]
) -> None:
    """
    Set new values to the parameters of the estimator.

    Parameters
    ----------
    params : dict
        Parameters values.

    Returns
    -------
    None

    """

    params = {k: v for k, v in params.items() if k in self._sarimax_params}
    for key, value in params.items():
        setattr(self, key, value)
        self._sarimax_params[key] = value

    self._consolidate_kwargs()

    # Reset values in case the model has already been fitted.
    self.output_type    = None
    self.sarimax_res    = None
    self.is_fitted      = False
    self.training_index = None
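
Two consequences of this implementation are worth noting: names that are not constructor parameters are dropped silently, and any call resets the fitted state. The class below is a hypothetical stand-in (not the library class) that mirrors only this filtering logic.

```python
class TinyEstimator:
    """Hypothetical stand-in that mirrors the filtering logic of `set_params`."""

    def __init__(self, order=(1, 0, 0), maxiter=50):
        self.order = order
        self.maxiter = maxiter
        self._known_params = {"order": order, "maxiter": maxiter}
        self.is_fitted = True  # pretend the model was already fitted

    def set_params(self, **params):
        # Keep only recognised names; anything else is dropped without error.
        params = {k: v for k, v in params.items() if k in self._known_params}
        for key, value in params.items():
            setattr(self, key, value)
            self._known_params[key] = value
        self.is_fitted = False  # any call invalidates the previous fit


est = TinyEstimator()
est.set_params(maxiter=200, max_iter=999)  # `max_iter` is a typo and is ignored
print(est.maxiter, hasattr(est, "max_iter"), est.is_fitted)  # 200 False False
```

Since misspelled names raise no error, it can be worth comparing `get_params()` before and after the call to confirm that the intended value was applied.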

get_params ¶


get_params(deep=True)

Get the non-trainable parameters of the estimator. This method is different from the params method, which returns the parameters of the fitted model.

Parameters:

Name	Type	Description	Default
`deep`	`bool`	If `True`, will return the parameters for this estimator and contained subobjects that are estimators.	`True`

Returns:

Name	Type	Description
`params`	`dict`	Parameters of the estimator.

Source code in skforecast\stats\_sarimax.py

def get_params(
    self, 
    deep: bool = True
) -> dict[str, object]:
    """
    Get the non-trainable parameters of the estimator. This method
    is different from the `params` method, which returns the parameters
    of the fitted model.

    Parameters
    ----------
    deep : bool, default True
        If `True`, will return the parameters for this estimator and 
        contained subobjects that are estimators.

    Returns
    -------
    params : dict
        Parameters of the estimator.

    """

    return self._sarimax_params.copy()

params ¶


params()

Get the parameters of the model. The order of variables is the trend coefficients, the k_exog exogenous coefficients, the k_ar AR coefficients, and finally the k_ma MA coefficients.

Returns:

Name	Type	Description
`params`	`numpy ndarray, pandas Series`	The parameters of the model.

Source code in skforecast\stats\_sarimax.py

@check_is_fitted
def params(
    self
) -> np.ndarray | pd.Series:
    """
    Get the parameters of the model. The order of variables is the trend
    coefficients, the `k_exog` exogenous coefficients, the `k_ar` AR 
    coefficients, and finally the `k_ma` MA coefficients.

    Returns
    -------
    params : numpy ndarray, pandas Series
        The parameters of the model.

    """

    return self.sarimax_res.params

summary ¶


summary(alpha=0.05, start=None)

Get a summary of the SARIMAXResults object.

Parameters:

Name	Type	Description	Default
`alpha`	`float`	Significance level for the confidence intervals shown in the summary; the intervals are (1 - alpha) %.	`0.05`
`start`	`int`	Integer of the start observation.	`None`

Returns:

Name	Type	Description
`summary`	`Summary instance`	This holds the summary table and text, which can be printed or converted to various output formats.

Source code in skforecast\stats\_sarimax.py

@check_is_fitted
def summary(
    self,
    alpha: float = 0.05,
    start: int = None
) -> object:
    """
    Get a summary of the SARIMAXResults object.

    Parameters
    ----------
    alpha : float, default 0.05
        Significance level for the confidence intervals shown in the
        summary; the intervals are (1 - alpha) %.
    start : int, default None
        Integer of the start observation.

    Returns
    -------
    summary : Summary instance
        This holds the summary table and text, which can be printed or 
        converted to various output formats.

    """

    return self.sarimax_res.summary(alpha=alpha, start=start)

get_info_criteria ¶


get_info_criteria(criteria='aic', method='standard')

Get the selected information criteria.

See https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAXResults.info_criteria.html for details on the statsmodels info_criteria method.

Parameters:

Name	Type	Description	Default
`criteria`	`str`	The information criteria to compute. Valid options are {'aic', 'bic', 'hqic'}.	`'aic'`
`method`	`str`	The method for information criteria computation. Default is 'standard' method; 'lutkepohl' computes the information criteria as in Lütkepohl (2007).	`'standard'`

Returns:

Name	Type	Description
`metric`	`float`	The value of the selected information criteria.

Source code in skforecast\stats\_sarimax.py

@check_is_fitted
def get_info_criteria(
    self,
    criteria: str = 'aic',
    method: str = 'standard'
) -> float:
    """
    Get the selected information criteria.

    See https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAXResults.info_criteria.html
    for details on the statsmodels info_criteria method.

    Parameters
    ----------
    criteria : str, default 'aic'
        The information criteria to compute. Valid options are {'aic', 'bic',
        'hqic'}.
    method : str, default 'standard'
        The method for information criteria computation. Default is 'standard'
        method; 'lutkepohl' computes the information criteria as in Lütkepohl
        (2007).

    Returns
    -------
    metric : float
        The value of the selected information criteria.

    """

    if criteria not in ['aic', 'bic', 'hqic']:
        raise ValueError(
            "Invalid value for `criteria`. Valid options are 'aic', 'bic', "
            "and 'hqic'."
        )

    if method not in ['standard', 'lutkepohl']:
        raise ValueError(
            "Invalid value for `method`. Valid options are 'standard' and "
            "'lutkepohl'."
        )

    metric = self.sarimax_res.info_criteria(criteria=criteria, method=method)

    return metric
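
For reference, the three criteria under the `'standard'` method follow the usual textbook definitions. The sketch below computes them from a log-likelihood, a sample size and a parameter count; the function name and the input values are hypothetical, and the `'lutkepohl'` variant is not shown.

```python
import math


def info_criteria(llf, nobs, k_params):
    """Standard AIC, BIC and HQIC from log-likelihood, sample size and parameter count."""
    return {
        "aic": -2 * llf + 2 * k_params,
        "bic": -2 * llf + math.log(nobs) * k_params,
        "hqic": -2 * llf + 2 * math.log(math.log(nobs)) * k_params,
    }


# Made-up values: log-likelihood of -120 with 100 observations and 3 parameters.
ic = info_criteria(llf=-120.0, nobs=100, k_params=3)
print(ic["aic"])  # 246.0
```

All three penalise the number of parameters and differ only in how strongly. Lower values indicate a better trade-off between fit and complexity, so the criteria are useful for comparing candidate orders fitted to the same data.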

get_feature_importances ¶


get_feature_importances()

Get feature importances for SARIMAX statsmodels model.

Source code in skforecast\stats\_sarimax.py

@check_is_fitted
def get_feature_importances(self) -> pd.DataFrame:
    """Get feature importances for SARIMAX statsmodels model."""

    feature_importances = self.params().to_frame().reset_index()
    feature_importances.columns = ['feature', 'importance']

    return feature_importances

skforecast.stats._ets.Ets ¶


Ets(
    m=1,
    model="ZZZ",
    damped=None,
    alpha=None,
    beta=None,
    gamma=None,
    phi=None,
    lambda_param=None,
    lambda_auto=False,
    bias_adjust=True,
    bounds="both",
    seasonal=True,
    trend=None,
    ic="aicc",
    allow_multiplicative=True,
    allow_multiplicative_trend=False,
)

Bases: BaseEstimator, RegressorMixin

Scikit-learn style wrapper for the ETS (Error, Trend, Seasonality) model.

This estimator treats a univariate time series as input. Call fit(y) with a 1D array-like of observations in time order, then produce out-of-sample forecasts via predict(steps) and prediction intervals via predict_interval(steps, level=...). In-sample diagnostics are available through get_fitted_values(), get_residuals() and summary().

Parameters:

Name	Type	Description	Default
`m`	`int`	Seasonal period (e.g., 12 for monthly data with yearly seasonality).	`1`
`model`	`str or None`	Three-letter model specification (e.g., "ANN", "AAA", "MAM"). First letter: error type (A=Additive, M=Multiplicative, Z=Auto). Second letter: trend type (N=None, A=Additive, M=Multiplicative, Z=Auto). Third letter: season type (N=None, A=Additive, M=Multiplicative, Z=Auto). Use "ZZZ" or None for automatic model selection.	`"ZZZ"`
`damped`	`bool or None`	Whether to use damped trend. If None, both damped and non-damped models are tried (only when model="ZZZ" or model=None).	`None`
`alpha`	`float or None`	Smoothing parameter for level (0 < alpha < 1). If None, estimated.	`None`
`beta`	`float or None`	Smoothing parameter for trend (0 < beta < alpha). If None, estimated.	`None`
`gamma`	`float or None`	Smoothing parameter for seasonality (0 < gamma < 1-alpha). If None, estimated.	`None`
`phi`	`float or None`	Damping parameter (0 < phi < 1). If None, estimated.	`None`
`lambda_param`	`float or None`	Box-Cox transformation parameter. If None, no transformation is applied.	`None`
`lambda_auto`	`bool`	If True, automatically select optimal Box-Cox lambda parameter.	`False`
`bias_adjust`	`bool`	Apply bias adjustment when back-transforming forecasts.	`True`
`bounds`	`str`	Parameter bounds type: "usual", "admissible", or "both".	`"both"`
`seasonal`	`bool`	Allow seasonal models (only used with model="ZZZ" or model=None).	`True`
`trend`	`bool or None`	Allow trend models. If None, automatically determined (only with model="ZZZ" or model=None).	`None`
`ic`	`{'aic', 'aicc', 'bic'}`	Information criterion for model selection (only with model="ZZZ" or model=None).	`"aicc"`
`allow_multiplicative`	`bool`	Allow multiplicative error and season models (only with model="ZZZ" or model=None).	`True`
`allow_multiplicative_trend`	`bool`	Allow multiplicative trend models (only with model="ZZZ" or model=None).	`False`
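The three-letter `model` code can be read position by position. The following is a minimal sketch of that reading; `parse_ets_spec` and the lookup tables are hypothetical helpers written for this example and are not part of skforecast.

```python
# Hypothetical lookup tables mirroring the letters documented above.
ERROR_TYPES = {"A": "additive", "M": "multiplicative", "Z": "auto"}
TREND_TYPES = {"N": "none", "A": "additive", "M": "multiplicative", "Z": "auto"}
SEASON_TYPES = {"N": "none", "A": "additive", "M": "multiplicative", "Z": "auto"}


def parse_ets_spec(model=None):
    """Split a three-letter ETS code into its error, trend and season parts."""
    if model is None:
        model = "ZZZ"  # None is documented as equivalent to "ZZZ"
    if len(model) != 3:
        raise ValueError(f"Expected a three-letter code, got {model!r}.")
    error, trend, season = model
    if error not in ERROR_TYPES:
        raise ValueError(f"Unknown error type {error!r}.")
    if trend not in TREND_TYPES:
        raise ValueError(f"Unknown trend type {trend!r}.")
    if season not in SEASON_TYPES:
        raise ValueError(f"Unknown season type {season!r}.")
    return {
        "error": ERROR_TYPES[error],
        "trend": TREND_TYPES[trend],
        "season": SEASON_TYPES[season],
    }


print(parse_ets_spec("MAM"))
print(parse_ets_spec(None))
```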

Attributes:

Name	Type	Description
`m`	`int`	Seasonal period (e.g., 12 for monthly data with yearly seasonality).
`model`	`str`	Three-letter model specification (e.g., "ANN", "AAA", "MAM"). Each letter represents error, trend, and season types respectively, using A (Additive), M (Multiplicative), N (None), or Z (Auto-select).
`damped`	`bool or None`	Whether to apply damping to the trend component. If None with model="ZZZ" or model=None, both damped and non-damped models are evaluated during automatic selection.
`alpha`	`float or None`	User-provided smoothing parameter for the level component (0 < alpha < 1). When None, the parameter is estimated during fitting.
`beta`	`float or None`	User-provided smoothing parameter for the trend component (0 < beta < alpha). When None, the parameter is estimated during fitting if trend is present.
`gamma`	`float or None`	User-provided smoothing parameter for the seasonal component (0 < gamma < 1-alpha). When None, the parameter is estimated during fitting if seasonality is present.
`phi`	`float or None`	User-provided damping parameter (0 < phi < 1). When None, the parameter is estimated during fitting if damped trend is used.
`lambda_param`	`float or None`	Box-Cox transformation parameter applied to the time series before fitting. When None, no transformation is applied unless lambda_auto is True.
`lambda_auto`	`bool`	Whether to automatically determine the optimal Box-Cox transformation parameter during model fitting.
`bias_adjust`	`bool`	Whether to apply bias adjustment when back-transforming forecasts from the Box-Cox transformed scale to the original scale.
`bounds`	`str`	Type of parameter bounds used during optimization: "usual" for standard bounds, "admissible" for stability-ensuring bounds, or "both" for their intersection.
`seasonal`	`bool`	Whether seasonal models are considered during automatic model selection (only applicable when model="ZZZ" or model=None).
`trend`	`bool or None`	Whether trend models are considered during automatic model selection. When None with model="ZZZ" or model=None, this is determined automatically based on the data.
`ic`	`{'aic', 'aicc', 'bic'}`	Information criterion used to compare and select the best model during automatic model selection (only applicable when model="ZZZ" or model=None).
`allow_multiplicative`	`bool`	Whether multiplicative error and seasonal components are allowed during automatic model selection (only applicable when model="ZZZ" or model=None).
`allow_multiplicative_trend`	`bool`	Whether multiplicative trend components are allowed during automatic model selection (only applicable when model="ZZZ" or model=None).
`model_`	`ETSModel or None`	The fitted ETS model object containing parameters, diagnostics, and state space representation. Available after calling `fit()`.
`model_config_`	`dict or None`	Dictionary containing the model configuration including error type, trend type, seasonal type, damping flag, and seasonal period. Available after calling `fit()`.
`params_`	`dict or None`	Dictionary of estimated model parameters including smoothing parameters (alpha, beta, gamma, phi) and initial state values. Available after calling `fit()`.
`aic_`	`float or None`	Akaike Information Criterion of the fitted model, measuring the quality of fit while penalizing model complexity. Available after calling `fit()`.
`bic_`	`float or None`	Bayesian Information Criterion of the fitted model, similar to AIC but with a stronger penalty for model complexity. Available after calling `fit()`.
`y_train_`	`ndarray of shape (n_samples,) or None`	The original training time series used to fit the model.
`fitted_values_`	`ndarray of shape (n_samples,) or None`	One-step-ahead in-sample fitted values from the model.
`in_sample_residuals_`	`ndarray of shape (n_samples,) or None`	In-sample residuals calculated as the difference between observed values and fitted values.
`n_features_in_`	`int or None`	Number of features (time series) seen during `fit()`. For ETS, this is always 1 as it handles univariate time series. Available after calling `fit()`.
`is_memory_reduced`	`bool`	Flag indicating whether `reduce_memory()` has been called to clear diagnostic arrays (y_train_, fitted_values_, in_sample_residuals_).
`is_fitted`	`bool`	Flag indicating whether the model has been successfully fitted to data.
`estimator_name_`	`str`	String identifier of the fitted model configuration (e.g., "Ets(AAA)"). This is updated after fitting to reflect the selected model.
`is_auto`	`bool`	Indicates whether automatic model selection was used (model="ZZZ" or model=None).
`best_params_`	`dict or None`	If automatic model selection was used (model="ZZZ" or model=None), this dictionary contains the parameters of the selected best model. Otherwise, it is None.
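The ranges documented for `alpha`, `beta`, `gamma` and `phi` depend on one another (for example, `beta` is bounded above by `alpha`). A small sketch of checking user-supplied values against those documented ranges follows; `check_smoothing_params` is a hypothetical helper, and whether and how the library enforces these ranges (which also depends on `bounds`) is not shown here.

```python
def check_smoothing_params(alpha=None, beta=None, gamma=None, phi=None):
    """Return a list of violations of the documented parameter ranges."""
    problems = []
    if alpha is not None and not 0 < alpha < 1:
        problems.append("alpha must satisfy 0 < alpha < 1")
    if beta is not None:
        # When alpha is left to be estimated, fall back to the loosest bound.
        upper = alpha if alpha is not None else 1.0
        if not 0 < beta < upper:
            problems.append("beta must satisfy 0 < beta < alpha")
    if gamma is not None:
        upper = 1.0 - alpha if alpha is not None else 1.0
        if not 0 < gamma < upper:
            problems.append("gamma must satisfy 0 < gamma < 1 - alpha")
    if phi is not None and not 0 < phi < 1:
        problems.append("phi must satisfy 0 < phi < 1")
    return problems


print(check_smoothing_params(alpha=0.3, beta=0.1, gamma=0.2, phi=0.9))
print(check_smoothing_params(alpha=0.3, beta=0.5))
```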

Methods:

Name	Description
`fit`	Fit the ETS model to a univariate time series.
`predict`	Generate mean forecasts steps ahead.
`predict_interval`	Forecast with prediction intervals.
`get_residuals`	Get in-sample residuals (observed - fitted) from the ETS model.
`get_fitted_values`	Get in-sample fitted values from the ETS model.
`get_score`	R^2 using in-sample fitted values.
`get_params`	Get parameters for this estimator.
`get_feature_importances`	Get feature importances for ETS model.
`get_info_criteria`	Get information criteria.
`set_params`	Set the parameters of this estimator and reset the fitted state.
`summary`	Print a summary of the fitted ETS model.
`reduce_memory`	Reduce memory usage by removing internal arrays not needed for prediction.

Source code in skforecast\stats\_ets.py

def __init__(
    self,
    m: int = 1,
    model: str | None = "ZZZ",
    damped: bool | None = None,
    alpha: float | None = None,
    beta: float | None = None,
    gamma: float | None = None,
    phi: float | None = None,
    lambda_param: float | None = None,
    lambda_auto: bool = False,
    bias_adjust: bool = True,
    bounds: str = "both",
    seasonal: bool = True,
    trend: bool | None = None,
    ic: Literal["aic", "aicc", "bic"] = "aicc",
    allow_multiplicative: bool = True,
    allow_multiplicative_trend: bool = False,
):

    if not isinstance(m, int) or m < 1:
        raise ValueError(
            f"`m` must be a positive integer greater than or equal to 1. "
            f"Got {m}."
        )

    self.m                          = m
    self.model                      = model if model is not None else "ZZZ"
    self.damped                     = damped
    self.alpha                      = alpha
    self.beta                       = beta
    self.gamma                      = gamma
    self.phi                        = phi
    self.lambda_param               = lambda_param
    self.lambda_auto                = lambda_auto
    self.bias_adjust                = bias_adjust
    self.bounds                     = bounds
    self.seasonal                   = seasonal
    self.trend                      = trend
    self.ic                         = ic
    self.allow_multiplicative       = allow_multiplicative
    self.allow_multiplicative_trend = allow_multiplicative_trend

    self.model_                     = None
    self.model_config_              = None
    self.params_                    = None
    self.aic_                       = None
    self.bic_                       = None
    self.y_train_                   = None
    self.fitted_values_             = None
    self.in_sample_residuals_       = None
    self.n_features_in_             = None
    self.is_memory_reduced          = False
    self.is_fitted                  = False
    self.best_params_               = None
    self.is_auto                    = self.model == "ZZZ"

    if self.is_auto:
        self.estimator_name_ = "AutoEts()"
    else:
        self.estimator_name_ = f"Ets({self.model})"

m `instance-attribute` ¶


m = m

model `instance-attribute` ¶


model = model if model is not None else 'ZZZ'

damped `instance-attribute` ¶


damped = damped

alpha `instance-attribute` ¶


alpha = alpha

beta `instance-attribute` ¶


beta = beta

gamma `instance-attribute` ¶


gamma = gamma

phi `instance-attribute` ¶


phi = phi

lambda_param `instance-attribute` ¶


lambda_param = lambda_param

lambda_auto `instance-attribute` ¶


lambda_auto = lambda_auto

bias_adjust `instance-attribute` ¶


bias_adjust = bias_adjust

bounds `instance-attribute` ¶


bounds = bounds

seasonal `instance-attribute` ¶


seasonal = seasonal

trend `instance-attribute` ¶


trend = trend

ic `instance-attribute` ¶


ic = ic

allow_multiplicative `instance-attribute` ¶


allow_multiplicative = allow_multiplicative

allow_multiplicative_trend `instance-attribute` ¶


allow_multiplicative_trend = allow_multiplicative_trend

model_ `instance-attribute` ¶


model_ = None

model_config_ `instance-attribute` ¶


model_config_ = None

params_ `instance-attribute` ¶


params_ = None

aic_ `instance-attribute` ¶


aic_ = None

bic_ `instance-attribute` ¶


bic_ = None

y_train_ `instance-attribute` ¶


y_train_ = None

fitted_values_ `instance-attribute` ¶


fitted_values_ = None

in_sample_residuals_ `instance-attribute` ¶


in_sample_residuals_ = None

n_features_in_ `instance-attribute` ¶


n_features_in_ = None

is_memory_reduced `instance-attribute` ¶


is_memory_reduced = False

is_fitted `instance-attribute` ¶


is_fitted = False

best_params_ `instance-attribute` ¶


best_params_ = None

is_auto `instance-attribute` ¶


is_auto = model == 'ZZZ'

estimator_name_ `instance-attribute` ¶


estimator_name_ = 'AutoEts()'

fit ¶


fit(y, exog=None)

Fit the ETS model to a univariate time series.

Parameters:

Name	Type	Description	Default
`y`	`array-like of shape (n_samples,)`	Time-ordered numeric sequence.	required
`exog`	`Ignored`	Exogenous variables. Ignored, present for API compatibility.	`None`

Returns:

Name	Type	Description
`self`	`Ets`	Fitted estimator.

Source code in skforecast\stats\_ets.py

def fit(self, y: pd.Series | np.ndarray, exog: None = None) -> Ets:
    """
    Fit the ETS model to a univariate time series.

    Parameters
    ----------
    y : array-like of shape (n_samples,)
        Time-ordered numeric sequence.
    exog : Ignored
        Exogenous variables. Ignored, present for API compatibility.

    Returns
    -------
    self : Ets
        Fitted estimator.

    """

    self.model_               = None
    self.model_config_        = None
    self.params_              = None
    self.aic_                 = None
    self.bic_                 = None
    self.y_train_             = None
    self.fitted_values_       = None
    self.in_sample_residuals_ = None
    self.n_features_in_       = None
    self.is_memory_reduced    = False
    self.is_fitted            = False
    self.best_params_         = None

    if not isinstance(y, (pd.Series, np.ndarray)):
        raise ValueError("`y` must be a pandas Series or numpy ndarray.")

    y = np.asarray(y, dtype=np.float64)
    if y.ndim == 2 and y.shape[1] == 1:
        # Allow (n, 1) shaped arrays and squeeze to 1D
        y = y.ravel()
    elif y.ndim != 1:
        raise ValueError("`y` must be a 1D array-like sequence.")
    if len(y) < 1:
        raise ValueError("`y` is too short to fit ETS model.")

    # Automatic model selection
    if self.model == "ZZZ":
        self.model_ = auto_ets(
            y,
            m                          = self.m,
            seasonal                   = self.seasonal,
            trend                      = self.trend,
            damped                     = self.damped,
            ic                         = self.ic,
            allow_multiplicative       = self.allow_multiplicative,
            allow_multiplicative_trend = self.allow_multiplicative_trend,
            lambda_auto                = self.lambda_auto,
            verbose                    = False,
        )

        self.best_params_ = {
            "m": self.model_.config.m,
            "model": f"{self.model_.config.error}{self.model_.config.trend}{self.model_.config.season}",
            "damped": self.model_.config.damped,
            "alpha": self.model_.params.alpha,
            "beta": self.model_.params.beta,
            "gamma": self.model_.params.gamma,
            "phi": self.model_.params.phi,
            "lambda_param": self.lambda_param,
            "lambda_auto": self.lambda_auto,
            "bias_adjust": self.bias_adjust,
            "bounds": self.bounds,
            "seasonal": self.seasonal,
            "trend": self.trend,
            "ic": self.ic,
            "allow_multiplicative": self.allow_multiplicative,
            "allow_multiplicative_trend": self.allow_multiplicative_trend,
        }

    else:
        # Fit specific model
        damped_param = False if self.damped is None else self.damped
        self.model_ = ets(
            y,
            m            = self.m,
            model        = self.model,
            damped       = damped_param,
            alpha        = self.alpha,
            beta         = self.beta,
            gamma        = self.gamma,
            phi          = self.phi,
            lambda_param = self.lambda_param,
            lambda_auto  = self.lambda_auto,
            bias_adjust  = self.bias_adjust,
            bounds       = self.bounds,
        )

    # Extract model attributes (use references to avoid duplicating arrays)
    self.model_config_        = asdict(self.model_.config)
    self.params_              = asdict(self.model_.params)
    self.aic_                 = self.model_.aic
    self.bic_                 = self.model_.bic
    self.y_train_             = self.model_.y_original
    self.fitted_values_       = self.model_.fitted
    self.in_sample_residuals_ = self.model_.residuals
    self.n_features_in_       = 1
    self.is_fitted            = True

    model_name = f"{self.model_config_['error']}{self.model_config_['trend']}{self.model_config_['season']}"
    if self.model_config_['damped'] and self.model_config_['trend'] != "N":
        model_name = f"{self.model_config_['error']}{self.model_config_['trend']}d{self.model_config_['season']}"

    self.estimator_name_ = f"Ets({model_name})"

    return self
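The naming step at the end of `fit` can be looked at in isolation. The sketch below reproduces that logic as a standalone function; `ets_display_name` is a name chosen for this example and does not exist in the library.

```python
def ets_display_name(error: str, trend: str, season: str, damped: bool) -> str:
    """Build the display name the same way the end of `fit` does."""
    name = f"{error}{trend}{season}"
    # A "d" is inserted after the trend letter only when a trend exists.
    if damped and trend != "N":
        name = f"{error}{trend}d{season}"
    return f"Ets({name})"


print(ets_display_name("A", "A", "A", damped=True))
print(ets_display_name("A", "N", "N", damped=True))
print(ets_display_name("M", "A", "M", damped=False))
```

Note that the damping flag has no effect on the name when the trend component is "N".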

predict ¶


predict(steps, exog=None)

Generate mean forecasts steps ahead.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Forecast horizon (must be > 0).	required
`exog`	`Ignored`	Exogenous variables. Ignored, present for API compatibility.	`None`

Returns:

Name	Type	Description
`predictions`	`ndarray of shape (steps,)`	Point forecasts for horizons 1 through `steps`.
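To give a feel for what a point forecast looks like in the simplest case, here is a sketch of simple exponential smoothing, the textbook form of the "ANN" model, where every horizon receives the last smoothed level. This is not the library's implementation: `ses_forecast` is a hypothetical function, `alpha` is fixed by hand, and the level is initialized with the first observation, whereas a fitted model estimates these quantities.

```python
def ses_forecast(y, alpha, steps):
    """Flat forecasts from simple exponential smoothing (illustrative only)."""
    if not isinstance(steps, int) or steps <= 0:
        raise ValueError("`steps` must be a positive integer.")
    level = y[0]  # assumption: initialize the level with the first observation
    for obs in y[1:]:
        # Move the level towards each new observation by a fraction alpha.
        level = level + alpha * (obs - level)
    return [level] * steps


print(ses_forecast([10.0, 12.0, 11.0], alpha=0.5, steps=3))
```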

Source code in skforecast\stats\_ets.py

@check_is_fitted
def predict(self, steps: int, exog: None = None) -> np.ndarray:
    """
    Generate mean forecasts steps ahead.

    Parameters
    ----------
    steps : int
        Forecast horizon (must be > 0).
    exog : None
        Exogenous variables. Ignored, present for API compatibility.

    Returns
    -------
    predictions : ndarray of shape (steps,)
        Point forecasts for steps 1..h.

    """

    if not isinstance(steps, (int, np.integer)) or steps <= 0:
        raise ValueError("`steps` must be a positive integer.")

    predictions = forecast_ets(
        self.model_,
        h           = steps,
        bias_adjust = self.bias_adjust,
        level       = None
    )
    return predictions["mean"]

predict_interval ¶


predict_interval(
    steps=1, level=(80, 95), as_frame=True, exog=None
)

Forecast with prediction intervals.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Forecast horizon.	`1`
`level`	`list or tuple of float`	Confidence levels in percent.	`(80, 95)`
`as_frame`	`bool`	If True, return a tidy DataFrame with columns 'mean', 'lower_<L>', 'upper_<L>' for each level L. If False, return a NumPy ndarray.	`True`
`exog`	`Ignored`	Exogenous variables. Ignored, present for API compatibility.	`None`

Returns:

Name	Type	Description
`predictions`	`numpy ndarray, pandas DataFrame`	If as_frame=True, pandas DataFrame with columns 'mean', 'lower_<L>', 'upper_<L>' for each level L. If as_frame=False, numpy ndarray.
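The column layout of the result follows directly from the order of `level`. The sketch below rebuilds the column names and the matching ndarray column positions in the same way as the source code of this method; `interval_columns` and `interval_positions` are helper names invented for this example.

```python
def interval_columns(level=(80, 95)):
    """Column names in the order used when as_frame=True."""
    columns = ["mean"]
    for lv in level:
        lv_int = int(lv)  # levels are truncated to integers in the names
        columns.append(f"lower_{lv_int}")
        columns.append(f"upper_{lv_int}")
    return columns


def interval_positions(level=(80, 95)):
    """(lower, upper) column indices per level when as_frame=False."""
    return {lv: (1 + 2 * i, 2 + 2 * i) for i, lv in enumerate(level)}


print(interval_columns((80, 95)))
print(interval_positions((80, 95)))
```

With `as_frame=False`, column 0 holds the mean and each level contributes a lower and an upper column, in that order.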

Source code in skforecast\stats\_ets.py

@check_is_fitted
def predict_interval(
    self,
    steps: int = 1,
    level: list[float] | tuple[float, ...] = (80, 95),
    as_frame: bool = True,
    exog: Any = None,
) -> np.ndarray | pd.DataFrame:
    """
    Forecast with prediction intervals.

    Parameters
    ----------
    steps : int, default 1
        Forecast horizon.
    level : list or tuple of float, default (80, 95)
        Confidence levels in percent.
    as_frame : bool, default True
        If True, return a tidy DataFrame with columns 'mean', 'lower_<L>',
        'upper_<L>' for each level L. If False, return a NumPy ndarray.
    exog : Ignored
        Exogenous variables. Ignored, present for API compatibility.

    Returns
    -------
    predictions : numpy ndarray, pandas DataFrame
        If as_frame=True, pandas DataFrame with columns 'mean', 'lower_<L>',
        'upper_<L>' for each level L. If as_frame=False, numpy ndarray.

    """

    if not isinstance(steps, (int, np.integer)) or steps <= 0:
        raise ValueError("`steps` must be a positive integer.")

    raw_preds = forecast_ets(
        self.model_,
        h           = steps,
        bias_adjust = self.bias_adjust,
        level       = list(level)
    )

    levels = list(level) if level is not None else []
    n_levels = len(levels)
    mean = np.asarray(raw_preds["mean"])

    predictions = np.empty((steps, 1 + 2 * n_levels), dtype=float)
    predictions[:, 0] = mean
    for i, lv in enumerate(levels):
        lv_int = int(lv)
        lower_key = f"lower_{lv_int}"
        upper_key = f"upper_{lv_int}"
        lower_arr = np.asarray(raw_preds[lower_key])
        upper_arr = np.asarray(raw_preds[upper_key])
        predictions[:, 1 + 2 * i] = lower_arr
        predictions[:, 1 + 2 * i + 1] = upper_arr

    if as_frame:
        col_names = ["mean"]
        for level in levels:
            level = int(level)
            col_names.append(f"lower_{level}")
            col_names.append(f"upper_{level}")

        predictions = pd.DataFrame(
            predictions, columns=col_names, index=pd.RangeIndex(1, steps + 1, name="step")
        )

    return predictions

get_residuals ¶


get_residuals()

Get in-sample residuals (observed - fitted) from the ETS model.

Returns:

Name	Type	Description
`residuals`	`ndarray of shape (n_samples,)`	In-sample residuals (observed - fitted).

Source code in skforecast\stats\_ets.py

@check_is_fitted
def get_residuals(self) -> np.ndarray:
    """
    Get in-sample residuals (observed - fitted) from the ETS model.

    Returns
    -------
    residuals : ndarray of shape (n_samples,)

    """

    check_memory_reduced(self, method_name='get_residuals')
    return self.in_sample_residuals_

get_fitted_values ¶


get_fitted_values()

Get in-sample fitted values from the ETS model.

Returns:

Name	Type	Description
`fitted`	`ndarray of shape (n_samples,)`	In-sample fitted values.

Source code in skforecast\stats\_ets.py

@check_is_fitted
def get_fitted_values(self) -> np.ndarray:
    """
    Get in-sample fitted values from the ETS model.

    Returns
    -------
    fitted : ndarray of shape (n_samples,)

    """

    check_memory_reduced(self, method_name='get_fitted_values')
    return self.fitted_values_

get_score ¶


get_score(y=None)

R^2 using in-sample fitted values.

Parameters:

Name	Type	Description	Default
`y`	`Ignored`	Present for API compatibility.	`None`

Returns:

Name	Type	Description
`score`	`float`	Coefficient of determination.
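The score is the usual coefficient of determination computed on the training data, with NaN pairs dropped and a tiny constant added to the denominator to avoid division by zero. A dependency-free sketch of the same arithmetic is shown below; `in_sample_r2` is a hypothetical name and the input values are made up.

```python
import math
import sys


def in_sample_r2(y, fitted):
    """R^2 over the pairs where neither value is NaN."""
    pairs = [
        (obs, fit)
        for obs, fit in zip(y, fitted)
        if not (math.isnan(obs) or math.isnan(fit))
    ]
    if len(pairs) < 2:
        return float("nan")
    mean_y = sum(obs for obs, _ in pairs) / len(pairs)
    ss_res = sum((obs - fit) ** 2 for obs, fit in pairs)
    # Machine epsilon keeps the ratio finite for a constant series.
    ss_tot = sum((obs - mean_y) ** 2 for obs, _ in pairs) + sys.float_info.epsilon
    return 1.0 - ss_res / ss_tot


r2 = in_sample_r2([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(round(r2, 4))
```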

Source code in skforecast\stats\_ets.py

@check_is_fitted
def get_score(self, y: Any = None) -> float:
    """
    R^2 using in-sample fitted values.

    Parameters
    ----------
    y : Ignored
        Present for API compatibility.

    Returns
    -------
    score : float
        Coefficient of determination.

    """

    check_memory_reduced(self, method_name='get_score')

    y = self.y_train_
    fitted = self.fitted_values_

    # Handle NaN values if any
    mask = ~(np.isnan(y) | np.isnan(fitted))
    if mask.sum() < 2:
        return float("nan")

    ss_res = np.sum((y[mask] - fitted[mask]) ** 2)
    ss_tot = np.sum((y[mask] - y[mask].mean()) ** 2) + np.finfo(float).eps

    return 1.0 - ss_res / ss_tot

get_params ¶


get_params(deep=True)

Get parameters for this estimator.

Parameters:

Name	Type	Description	Default
`deep`	`bool`	If True, will return the parameters for this estimator and contained subobjects that are estimators.	`True`

Returns:

Name	Type	Description
`params`	`dict`	Parameter names mapped to their values.

Source code in skforecast\stats\_ets.py

def get_params(self, deep: bool = True) -> dict:
    """
    Get parameters for this estimator.

    Parameters
    ----------
    deep : bool, default True
        If True, will return the parameters for this estimator and
        contained subobjects that are estimators.

    Returns
    -------
    params : dict
        Parameter names mapped to their values.

    """

    return {
        "m": self.m,
        "model": self.model,
        "damped": self.damped,
        "alpha": self.alpha,
        "beta": self.beta,
        "gamma": self.gamma,
        "phi": self.phi,
        "seasonal": self.seasonal,
        "trend": self.trend,
        "allow_multiplicative": self.allow_multiplicative,
        "allow_multiplicative_trend": self.allow_multiplicative_trend,
    }

get_feature_importances ¶


get_feature_importances()

Get feature importances for ETS model.

Source code in skforecast\stats\_ets.py

@check_is_fitted
def get_feature_importances(self) -> pd.DataFrame:
    """Get feature importances for ETS model."""
    features = ['alpha (level)']
    importances = [self.params_['alpha']]

    if self.model_config_['trend'] != 'N':
        features.append('beta (trend)')
        importances.append(self.params_['beta'])

    if self.model_config_['season'] != 'N':
        features.append('gamma (seasonal)')
        importances.append(self.params_['gamma'])

    if self.model_config_['damped']:
        features.append('phi (damping)')
        importances.append(self.params_['phi'])

    return pd.DataFrame({
        'feature': features,
        'importance': importances
    })

get_info_criteria ¶


get_info_criteria(criteria)

Get information criteria.

Parameters:

Name	Type	Description	Default
`criteria`	`str`	Information criterion to retrieve. Valid options are 'aic' and 'bic'.	required

Returns:

Name	Type	Description
`info_criteria`	`float`	Value of the requested information criterion.

Source code in skforecast\stats\_ets.py

@check_is_fitted
def get_info_criteria(self, criteria: str) -> float:
    """
    Get information criteria.

    Parameters
    ----------
    criteria : str
        Information criterion to retrieve. Valid options are 'aic' and 'bic'.
    Returns
    -------
    info_criteria : float
        Value of the requested information criterion.

    """
    if criteria not in {'aic', 'bic'}:
        raise ValueError(
            "Invalid value for `criteria`. Valid options are 'aic' and 'bic' "
            "for ETS model."
        )

    if criteria == 'aic':
        value = self.aic_
    elif criteria == 'bic':
        value = self.bic_

    return value

_set_params ¶


_set_params(**params)

Set the parameters of this estimator without resetting the fitted state. This method is intended for internal use only; use set_params() instead.

Parameters:

Name	Type	Description	Default
`**params`	`dict`	Estimator parameters.	`{}`

Returns:

Type	Description
`None`

Source code in skforecast\stats\_ets.py

def _set_params(self, **params) -> None:
    """
    Set the parameters of this estimator. Internal method without resetting 
    the fitted state. This method is intended for internal use only, please 
    use `set_params()` instead.

    Parameters
    ----------
    **params : dict
        Estimator parameters.

    Returns
    -------
    None

    """

    for key, value in params.items():
        setattr(self, key, value)

    self.is_auto = self.model is None or self.model == "ZZZ"
    if self.is_auto:
        self.model = "ZZZ"
        estimator_name_ = "AutoEts()"
    else:
        estimator_name_ = f"Ets({self.model})"

    self.estimator_name_ = estimator_name_

set_params ¶


set_params(**params)

Set the parameters of this estimator and reset the fitted state.

This method resets the estimator to its unfitted state whenever parameters are changed, requiring the model to be refitted before making predictions.

Parameters:

Name	Type	Description	Default
`**params`	`dict`	Estimator parameters. Valid parameter keys are: 'm', 'model', 'damped', 'alpha', 'beta', 'gamma', 'phi', 'lambda_param', 'lambda_auto', 'bias_adjust', 'bounds', 'seasonal', 'trend', 'ic', 'allow_multiplicative', 'allow_multiplicative_trend'.	`{}`

Returns:

Name	Type	Description
`self`	`Ets`	The estimator with updated parameters and reset state.

Raises:

Type	Description
`ValueError`	If any parameter key is invalid.
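The validate-then-reset behaviour described above can be illustrated with a toy class. `TinyEstimator` is a hypothetical stand-in with only two parameters; it is not the skforecast class and only mimics the pattern of rejecting unknown keys and clearing the fitted flag.

```python
class TinyEstimator:
    """Hypothetical stand-in that mimics the set_params pattern."""

    _valid_params = {"m", "model"}

    def __init__(self, m=1, model="ZZZ"):
        self.m = m
        self.model = model
        self.is_fitted = False

    def fit(self):
        self.is_fitted = True
        return self

    def set_params(self, **params):
        # Reject unknown keys before changing anything.
        for key in params:
            if key not in self._valid_params:
                raise ValueError(
                    f"Invalid parameter '{key}'. "
                    f"Valid parameters are: {sorted(self._valid_params)}"
                )
        for key, value in params.items():
            setattr(self, key, value)
        # Any parameter change invalidates the previous fit.
        self.is_fitted = False
        return self


est = TinyEstimator().fit()
est.set_params(model="AAN")
print(est.model, est.is_fitted)
```

Because the fitted state is cleared, the estimator has to be refitted before predicting again.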

Source code in skforecast\stats\_ets.py

def set_params(self, **params) -> Ets:
    """
    Set the parameters of this estimator and reset the fitted state.

    This method resets the estimator to its unfitted state whenever parameters
    are changed, requiring the model to be refitted before making predictions.

    Parameters
    ----------
    **params : dict
        Estimator parameters. Valid parameter keys are: 'm', 'model', 'damped',
        'alpha', 'beta', 'gamma', 'phi', 'lambda_param', 'lambda_auto',
        'bias_adjust', 'bounds', 'seasonal', 'trend', 'ic', 'allow_multiplicative',
        'allow_multiplicative_trend'.

    Returns
    -------
    self : Ets
        The estimator with updated parameters and reset state.

    Raises
    ------
    ValueError
        If any parameter key is invalid.

    """

    valid_params = {
        'm', 'model', 'damped', 'alpha', 'beta', 'gamma', 'phi',
        'lambda_param', 'lambda_auto', 'bias_adjust', 'bounds',
        'seasonal', 'trend', 'ic', 'allow_multiplicative',
        'allow_multiplicative_trend'
    }
    for key in params.keys():
        if key not in valid_params:
            raise ValueError(
                f"Invalid parameter '{key}' for estimator {self.__class__.__name__}. "
                f"Valid parameters are: {sorted(valid_params)}"
            )

    self._set_params(**params)

    # Reset fitted state - model needs to be refitted with new parameters
    self.model_               = None
    self.model_config_        = None
    self.params_              = None
    self.y_train_             = None
    self.fitted_values_       = None
    self.in_sample_residuals_ = None
    self.n_features_in_       = None
    self.is_memory_reduced    = False
    self.is_fitted            = False
    self.best_params_         = None

    return self

summary ¶


summary()

Print a summary of the fitted ETS model.

Source code in skforecast\stats\_ets.py

@check_is_fitted
def summary(self) -> None:
    """
    Print a summary of the fitted ETS model.
    """

    print("ETS Model Summary")
    print("=" * 60)
    print(f"Model: {self.estimator_name_}")
    print(f"Seasonal period (m): {self.model_config_['m']}")
    print()

    print("Smoothing parameters:")
    print(f"  alpha (level):       {self.params_['alpha']:.4f}")
    if self.model_config_['trend'] != "N":
        print(f"  beta (trend):        {self.params_['beta']:.4f}")
    if self.model_config_['season'] != "N":
        print(f"  gamma (seasonal):    {self.params_['gamma']:.4f}")
    if self.model_config_['damped']:
        print(f"  phi (damping):       {self.params_['phi']:.4f}")
    print()

    print("Initial states:")
    print(f"  Level (l0):          {self.params_['init_states'][0]:.4f}")
    if self.model_config_['trend'] != "N" and len(self.params_['init_states']) > 1:
        print(f"  Trend (b0):          {self.params_['init_states'][1]:.4f}")
    print()

    print("Model fit statistics:")
    print(f"  sigma^2:             {self.model_.sigma2:.6f}")
    print(f"  Log-likelihood:      {self.model_.loglik:.2f}")
    print(f"  AIC:                 {self.aic_:.2f}")
    print(f"  BIC:                 {self.bic_:.2f}")
    print()

    if not self.is_memory_reduced:
        print("Residual statistics:")
        print(f"  Mean:                {np.mean(self.in_sample_residuals_):.6f}")
        print(f"  Std Dev:             {np.std(self.in_sample_residuals_, ddof=1):.6f}")
        print(f"  MAE:                 {np.mean(np.abs(self.in_sample_residuals_)):.6f}")
        print(f"  RMSE:                {np.sqrt(np.mean(self.in_sample_residuals_**2)):.6f}")
        print()

        print("Time Series Summary Statistics:")
        print(f"Number of observations: {len(self.y_train_)}")
        print(f"  Mean:                 {np.mean(self.y_train_):.4f}")
        print(f"  Std Dev:              {np.std(self.y_train_, ddof=1):.4f}")
        print(f"  Min:                  {np.min(self.y_train_):.4f}")
        print(f"  25%:                  {np.percentile(self.y_train_, 25):.4f}")
        print(f"  Median:               {np.median(self.y_train_):.4f}")
        print(f"  75%:                  {np.percentile(self.y_train_, 75):.4f}")
        print(f"  Max:                  {np.max(self.y_train_):.4f}")

reduce_memory ¶


reduce_memory()

Reduce memory usage by clearing internal arrays that are needed only for diagnostics, not for prediction. After calling this method, the following methods will raise an error:

fitted_(): In-sample fitted values
residuals_(): In-sample residuals
score(): R² coefficient
summary(): Model summary statistics

Prediction methods remain fully functional:

predict(): Point forecasts
predict_interval(): Prediction intervals

Returns:

Name	Type	Description
`self`	`Ets`	The estimator with reduced memory usage.
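The trade-off can be illustrated with a toy estimator. This is a minimal sketch, not the skforecast implementation: `TinyMeanModel` and its `_guard` helper are hypothetical names, and the exception type raised by the library after reduction is not shown in this section, so the sketch assumes a `RuntimeError`.

```python
import numpy as np


class TinyMeanModel:
    """Hypothetical mean-only forecaster used to illustrate reduce_memory()."""

    def fit(self, y):
        y = np.asarray(y, dtype=float)
        self.mean_ = float(y.mean())                # needed for prediction
        self.in_sample_residuals_ = y - self.mean_  # needed only for diagnostics
        self.is_memory_reduced = False
        return self

    def _guard(self, method_name):
        # Assumption: diagnostics fail loudly once the arrays are gone.
        if self.is_memory_reduced:
            raise RuntimeError(
                f"Cannot call {method_name}(): reduce_memory() has been called."
            )

    def get_residuals(self):
        self._guard("get_residuals")
        return self.in_sample_residuals_

    def predict(self, steps):
        # Uses only the scalar mean, so it keeps working after reduction.
        return np.full(steps, self.mean_)

    def reduce_memory(self):
        self.in_sample_residuals_ = None
        self.is_memory_reduced = True
        return self


model = TinyMeanModel().fit([1.0, 2.0, 3.0]).reduce_memory()
forecast = model.predict(2)
```

Calling `reduce_memory()` is therefore most useful once diagnostics have been inspected and the fitted object only needs to be stored or used for forecasting.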

Source code in skforecast\stats\_ets.py

@check_is_fitted
def reduce_memory(self) -> Ets:
    """
    Reduce memory usage by removing internal arrays not needed for prediction.
    This method clears memory-heavy arrays that are only needed for diagnostics
    but not for prediction. After calling this method, the following methods
    will raise an error:

    - fitted_(): In-sample fitted values
    - residuals_(): In-sample residuals
    - score(): R² coefficient
    - summary(): Model summary statistics

    Prediction methods remain fully functional:

    - predict(): Point forecasts
    - predict_interval(): Prediction intervals

    Returns
    -------
    self : Ets
        The estimator with reduced memory usage.

    """

    # Clear arrays at Ets level
    self.y_train_ = None
    self.fitted_values_ = None
    self.in_sample_residuals_ = None

    # Clear arrays at ETSModel level
    if hasattr(self, 'model_'):
        self.model_.fitted = None
        self.model_.residuals = None
        self.model_.y_original = None

    self.is_memory_reduced = True

    return self

skforecast.stats._arar.Arar ¶


Arar(max_ar_depth=None, max_lag=None, safe=True)

Bases: BaseEstimator, RegressorMixin

Scikit-learn style wrapper for the ARAR time-series model.

This estimator treats a univariate sequence as "the feature". Call fit(y) with a 1D pandas Series or NumPy array of observations in time order, then produce out-of-sample forecasts via predict(steps) and prediction intervals via predict_interval(steps, level=...). In-sample diagnostics are available through get_fitted_values(), get_residuals() and summary().

Parameters:

Name	Type	Description	Default
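`max_ar_depth`	`int or None`	Maximum AR depth considered for the (1, i, j, k) AR selection stage. If None, a default is determined automatically from the series length.	`None`
`max_lag`	`int or None`	Maximum lag used when estimating autocovariances. If None, a default is determined automatically from the series length.	`None`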
`safe`	`bool`	If True, falls back to a mean-only model on numerical issues or very short series; otherwise errors are raised.	`True`

Attributes:

Name	Type	Description
`max_ar_depth`	`int or None`	Maximum AR depth considered for the (1, i, j, k) AR selection stage during model fitting. When None, a default value is determined automatically based on the series length.
`max_lag`	`int or None`	Maximum lag used when estimating autocovariances during the memory-shortening step. When None, a default value is determined automatically based on the series length.
`safe`	`bool`	Whether to use safe mode. When True, the model falls back to a mean-only forecast on numerical issues or very short series. When False, errors are raised instead.
`model_`	`tuple or None`	Raw tuple returned by the underlying ARAR algorithm containing: (Y, best_phi, best_lag, sigma2, psi, sbar, max_ar_depth, max_lag). Available after calling `fit()`.
`coef_`	`ndarray of shape (4,) or None`	Estimated AR coefficients for the selected lags (1, i, j, k). Some coefficients may be zero if the corresponding lag was not selected. Available after calling `fit()`.
`lags_`	`tuple or None`	Selected lag indices (1, i, j, k) used in the AR model, where each represents which past observations contribute to the forecast. Available after calling `fit()`.
`sigma2_`	`float or None`	Estimated innovation variance (one-step-ahead forecast error variance) from the fitted ARAR model. Available after calling `fit()`.
`psi_`	`ndarray or None`	Memory-shortening filter coefficients used to transform the original series into one with shorter memory before AR fitting. Available after calling `fit()`.
`sbar_`	`float or None`	Mean of the memory-shortened series, used as the long-run mean in forecasting. Available after calling `fit()`.
`aic_`	`float or None`	Akaike Information Criterion measuring model fit quality while penalizing complexity. For models with exogenous variables, this is an approximate calculation that treats the two-step procedure (regression + ARAR) as independent stages, which may underestimate total model complexity. Available after calling `fit()`.
`bic_`	`float or None`	Bayesian Information Criterion, similar to AIC but with a stronger penalty for model complexity. For models with exogenous variables, this is an approximate calculation that treats the two-step procedure (regression + ARAR) as independent stages, which may underestimate total model complexity. Available after calling `fit()`.
`exog_model_`	`FastLinearRegression or None`	Fitted linear regression model for exogenous variables. When exogenous variables are provided during fitting, this model captures their linear relationship with the target series. Available after calling `fit()` with exogenous variables.
`coef_exog_`	`ndarray of shape (n_exog_features,) or None`	Coefficients from the exogenous variables regression model, excluding the intercept. Available after calling `fit()` with exogenous variables.
`n_exog_features_in_`	`int or None`	Number of exogenous features used during fitting. Zero if no exogenous variables were provided. Available after calling `fit()`.
`y_train_`	`ndarray of shape (n_samples,) or None`	Original training time series used to fit the model.
`fitted_values_`	`ndarray of shape (n_samples,) or None`	One-step-ahead in-sample fitted values. The first k-1 values may be NaN where k is the largest lag used.
`in_sample_residuals_`	`ndarray of shape (n_samples,) or None`	In-sample residuals calculated as the difference between observed values and fitted values.
`n_features_in_`	`int or None`	Number of features (time series) seen during `fit()`. For ARAR, this is always 1 as it handles univariate time series (present for scikit-learn compatibility). Available after calling `fit()`.
`is_memory_reduced`	`bool`	Flag indicating whether `reduce_memory()` has been called to clear diagnostic arrays (y_train_, fitted_values_, in_sample_residuals_).
`is_fitted`	`bool`	Flag indicating whether the model has been successfully fitted to data.
`estimator_name_`	`str`	String identifier of the fitted model configuration (e.g., "Arar(lags=(1, 2, 3, 4))"). This is updated after fitting to reflect the selected lags.

Notes

When exogenous variables are provided during fitting, the model uses a two-step approach (regression followed by ARAR on residuals). In this approach, the target series is first regressed on the exogenous variables using a linear regression model. The residuals from this regression, representing the portion of the series not explained by the exogenous variables, are then modeled using the ARAR model.

This design allows the influence of exogenous variables to be incorporated prior to applying the ARAR model, rather than within the ARAR dynamics themselves.

This two-step approach is necessary because the ARAR model is inherently univariate and does not natively support exogenous variables. By separating the regression step, the method preserves the original ARAR formulation while still capturing the effects of external predictors.

However, this approach carries important assumptions and implications:

The relationship between the target series and the exogenous variables is assumed to be linear and time-invariant.
The ARAR model is applied only to the residual process, meaning its parameters describe the dynamics of the series after removing the contribution of exogenous variables.
As a result, the interpretability of the ARAR parameters changes: they no longer describe the full data-generating process, but rather the behavior of the unexplained component.

Despite these limitations, this strategy provides a practical and computationally efficient way to incorporate exogenous information into an otherwise univariate ARAR framework.
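The two-step procedure can be sketched with NumPy alone. This is an illustration under stated assumptions, not the library's code: ordinary least squares via `np.linalg.lstsq` stands in for `FastLinearRegression`, and a simple AR(1) fit stands in for the ARAR algorithm, which is considerably more elaborate. The function name `two_step_forecast` is hypothetical.

```python
import numpy as np


def two_step_forecast(y, exog, exog_future):
    """Regress y on exog, model the residuals, then add both forecasts."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(exog, dtype=float)])

    # Step 1: linear regression of the target on the exogenous variables.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    # Step 2: univariate model on the residuals (AR(1) as a stand-in for ARAR).
    denom = np.dot(resid[:-1], resid[:-1])
    phi = np.dot(resid[:-1], resid[1:]) / denom if denom > 0 else 0.0

    # Forecast: regression prediction plus the decaying residual forecast.
    exog_future = np.asarray(exog_future, dtype=float)
    if exog_future.ndim == 1:
        exog_future = exog_future.reshape(-1, 1)
    steps = len(exog_future)
    X_future = np.column_stack([np.ones(steps), exog_future])
    resid_forecast = resid[-1] * phi ** np.arange(1, steps + 1)
    return X_future @ beta + resid_forecast, beta, phi


rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = 3.0 + 2.0 * x + rng.normal(scale=0.1, size=200)
preds, beta, phi = two_step_forecast(y, x, exog_future=[10.5, 11.0])
```

In this sketch `phi` describes only the residual dynamics, which is the interpretability caveat listed above: the time-series parameters say nothing about the part of `y` already explained by the regression.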

Methods:

Name	Description
`fit`	Fit the ARAR model to a univariate time series.
`predict`	Generate mean (point) forecasts for the next `steps` periods.
`predict_interval`	Forecast with symmetric normal-theory prediction intervals.
`get_residuals`	Get in-sample residuals (observed - fitted) from the ARAR model.
`get_fitted_values`	Get in-sample fitted values from the ARAR model.
`get_score`	R^2 using in-sample fitted values (ignores initial NaNs).
`get_params`	Get parameters for this estimator.
`get_feature_importances`	Get feature importances for Arar model.
`get_info_criteria`	Get information criteria.
`set_params`	Set the parameters of this estimator and reset the fitted state.
`summary`	Print a simple textual summary of the fitted Arar model.
`reduce_memory`	Reduce memory usage by removing internal arrays not needed for prediction.

Source code in skforecast\stats\_arar.py

def __init__(
    self, 
    max_ar_depth: int | None = None, 
    max_lag: int | None = None, 
    safe: bool = True
):
    self.max_ar_depth           = max_ar_depth
    self.max_lag                = max_lag
    self.safe                   = safe
    self.lags_                  = None
    self.sigma2_                = None
    self.psi_                   = None
    self.sbar_                  = None

    self.model_                 = None
    self.coef_                  = None
    self.aic_                   = None
    self.bic_                   = None
    self.exog_model_            = None
    self.coef_exog_             = None
    self.n_exog_features_in_    = None
    self.y_train_               = None
    self.fitted_values_         = None
    self.in_sample_residuals_   = None
    self.n_features_in_         = None
    self.is_memory_reduced      = False
    self.is_fitted              = False
    self.estimator_name_        = "Arar()"

max_ar_depth `instance-attribute` ¶


max_ar_depth = max_ar_depth

max_lag `instance-attribute` ¶


max_lag = max_lag

safe `instance-attribute` ¶


safe = safe

lags_ `instance-attribute` ¶


lags_ = None

sigma2_ `instance-attribute` ¶


sigma2_ = None

psi_ `instance-attribute` ¶


psi_ = None

sbar_ `instance-attribute` ¶


sbar_ = None

model_ `instance-attribute` ¶


model_ = None

coef_ `instance-attribute` ¶


coef_ = None

aic_ `instance-attribute` ¶


aic_ = None

bic_ `instance-attribute` ¶


bic_ = None

exog_model_ `instance-attribute` ¶


exog_model_ = None

coef_exog_ `instance-attribute` ¶


coef_exog_ = None

n_exog_features_in_ `instance-attribute` ¶


n_exog_features_in_ = None

y_train_ `instance-attribute` ¶


y_train_ = None

fitted_values_ `instance-attribute` ¶


fitted_values_ = None

in_sample_residuals_ `instance-attribute` ¶


in_sample_residuals_ = None

n_features_in_ `instance-attribute` ¶


n_features_in_ = None

is_memory_reduced `instance-attribute` ¶


is_memory_reduced = False

is_fitted `instance-attribute` ¶


is_fitted = False

estimator_name_ `instance-attribute` ¶


estimator_name_ = 'Arar()'

fit ¶


fit(y, exog=None, suppress_warnings=False)

Fit the ARAR model to a univariate time series.

Parameters:

Name	Type	Description	Default
`y`	`Series or ndarray of shape (n_samples,)`	Time-ordered numeric sequence. Other types raise a `TypeError`.	required
`exog`	`Series, DataFrame, or ndarray of shape (n_samples, n_exog_features)`	Exogenous variables to include in the model. See Notes section for details on how exogenous variables are handled.	`None`
`suppress_warnings`	`bool`	If True, suppresses the warning about exogenous variables affecting model interpretation.	`False`

Returns:

Name	Type	Description
`self`	`Arar`	Fitted estimator.

Notes

When exogenous variables are provided during fitting, the model uses a two-step approach (regression followed by ARAR on residuals). In this approach, the target series is first regressed on the exogenous variables using a linear regression model. The residuals from this regression, representing the portion of the series not explained by the exogenous variables, are then modeled using the ARAR model.

This design allows the influence of exogenous variables to be incorporated prior to applying the ARAR model, rather than within the ARAR dynamics themselves.

This two-step approach is necessary because the ARAR model is inherently univariate and does not natively support exogenous variables. By separating the regression step, the method preserves the original ARAR formulation while still capturing the effects of external predictors.

However, this approach carries important assumptions and implications:

The relationship between the target series and the exogenous variables is assumed to be linear and time-invariant.
The ARAR model is applied only to the residual process, meaning its parameters describe the dynamics of the series after removing the contribution of exogenous variables.
As a result, the interpretability of the ARAR parameters changes: they no longer describe the full data-generating process, but rather the behavior of the unexplained component.

Despite these limitations, this strategy provides a practical and computationally efficient way to incorporate exogenous information into an otherwise univariate ARAR framework.

Source code in skforecast\stats\_arar.py

def fit(
    self, 
    y: np.ndarray | pd.Series, 
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None,
    suppress_warnings: bool = False
) -> "Arar":
    """
    Fit the ARAR model to a univariate time series.

    Parameters
    ----------
    y : Series or ndarray of shape (n_samples,)
        Time-ordered numeric sequence.
    exog : Series, DataFrame, or ndarray of shape (n_samples, n_exog_features), default None
        Exogenous variables to include in the model. See Notes section for details
        on how exogenous variables are handled.
    suppress_warnings : bool, default False
        If True, suppresses the warning about exogenous variables affecting model
        interpretation.

    Returns
    -------
    self : Arar
        Fitted estimator.

    Notes
    -----
    When exogenous variables are provided during fitting, the model uses a
    two-step approach (regression followed by ARAR on residuals). In this
    approach, the target series is first regressed on the exogenous variables
    using a linear regression model. The residuals from this regression,
    representing the portion of the series not explained by the exogenous
    variables, are then modeled using the ARAR model.

    This design allows the influence of exogenous variables to be incorporated
    prior to applying the ARAR model, rather than within the ARAR dynamics
    themselves.

    This two-step approach is necessary because the ARAR model is inherently
    univariate and does not natively support exogenous variables. By separating
    the regression step, the method preserves the original ARAR formulation
    while still capturing the effects of external predictors.

    However, this approach carries important assumptions and implications:

    - The relationship between the target series and the exogenous variables is
    assumed to be linear and time-invariant.
    - The ARAR model is applied only to the residual process, meaning its
    parameters describe the dynamics of the series after removing the
    contribution of exogenous variables.
    - As a result, the interpretability of the ARAR parameters changes: they no
    longer describe the full data-generating process, but rather the behavior
    of the unexplained component.

    Despite these limitations, this strategy provides a practical and
    computationally efficient way to incorporate exogenous information into an
    otherwise univariate ARAR framework.

    """

    self.lags_                = None
    self.sigma2_              = None
    self.psi_                 = None
    self.sbar_                = None

    self.model_               = None
    self.coef_                = None
    self.aic_                 = None
    self.bic_                 = None
    self.exog_model_          = None
    self.coef_exog_           = None
    self.n_exog_features_in_  = None
    self.y_train_             = None
    self.fitted_values_       = None
    self.in_sample_residuals_ = None
    self.n_features_in_       = None
    self.is_memory_reduced    = False
    self.is_fitted            = False

    if not isinstance(y, (pd.Series, np.ndarray)):
        raise TypeError("`y` must be a pandas Series or numpy ndarray.")

    if not isinstance(exog, (type(None), pd.Series, pd.DataFrame, np.ndarray)):
        raise TypeError("`exog` must be None, a pandas Series, pandas DataFrame, or numpy ndarray.")

    y = np.asarray(y, dtype=float)
    if y.ndim == 2 and y.shape[1] == 1:
        y = y.ravel()
    elif y.ndim != 1:
        raise ValueError("`y` must be a 1D array-like sequence.")

    series_to_arar = y

    if exog is not None:
        if not suppress_warnings:
            warnings.warn(
                "Exogenous variables are being handled using a two-step approach: "
                "(1) linear regression on exog, (2) ARAR on residuals. "
                "This affects model interpretation:\n"
                "  - ARAR coefficients (coef_) describe residual dynamics, not the original series\n"
                "  - Pred intervals reflect only ARAR uncertainty, not exog regression uncertainty\n"
                "  - Assumes a linear, time-invariant relationship between exog and target\n"
                "For more details, see the Notes section of Arar.fit().",
                ExogenousInterpretationWarning
            )

        exog = np.asarray(exog, dtype=float)
        if exog.ndim == 1:
            exog = exog.reshape(-1, 1)
        elif exog.ndim != 2:
            raise ValueError("`exog` must be 1D or 2D.")

        if len(exog) != len(y):
            raise ValueError(f"Length of exog ({len(exog)}) must match length of y ({len(y)})")

        self.exog_model_ = FastLinearRegression()
        self.exog_model_.fit(exog, y)
        self.coef_exog_ = self.exog_model_.coef_
        series_to_arar = y - self.exog_model_.predict(exog)

    if series_to_arar.size < 2 and not self.safe:
        raise ValueError("Series too short to fit ARAR when safe=False.")

    self.model_ = arar(
        series_to_arar, max_ar_depth=self.max_ar_depth, max_lag=self.max_lag, safe=self.safe
    )

    (Y, best_phi, best_lag, sigma2, psi, sbar, max_ar_depth, max_lag) = self.model_

    self.max_ar_depth        = max_ar_depth
    self.max_lag             = max_lag
    self.lags_               = tuple(best_lag)
    self.sigma2_             = float(sigma2)
    self.psi_                = np.asarray(psi, dtype=float)
    self.sbar_               = float(sbar)
    self.coef_               = np.asarray(best_phi, dtype=float)
    self.y_train_            = y
    self.n_exog_features_in_ = exog.shape[1] if exog is not None else 0
    self.n_features_in_      = 1       
    self.is_memory_reduced   = False
    self.is_fitted           = True

    arar_fitted = fitted_arar(self.model_)["fitted"]
    if self.exog_model_ is not None:
        exog_fitted = self.exog_model_.predict(exog)
        self.fitted_values_ = exog_fitted + arar_fitted
    else:
        self.fitted_values_ = arar_fitted

    # Residuals: original y minus fitted values
    self.in_sample_residuals_ = y - self.fitted_values_

    # Compute AIC and BIC
    # Note: For models with exogenous variables, this is an approximate calculation
    # that treats the two-step procedure (regression + ARAR) as independent stages.
    # This may underestimate model complexity. Use these criteria primarily for
    # comparing models with the same exogenous structure.
    largest_lag = max(self.lags_)
    valid_residuals = self.in_sample_residuals_[largest_lag:]
    # Remove NaN values for AIC/BIC calculation
    valid_residuals = valid_residuals[~np.isnan(valid_residuals)]
    n = len(valid_residuals)
    if n > 0:
        # Count parameters:
        # - ARAR: 4 AR coefficients + 1 mean parameter (sbar) + 1 variance (sigma2) = 6
        # - Exog: n_exog coefficients + 1 intercept (if exog present)
        # Note: We count all 4 AR coefficients even if some are zero, as they were
        # selected during model fitting. The variance parameter sigma2 is also estimated.
        k_arar = 6  # 4 AR coefficients + sbar + sigma2
        k_exog = (self.n_exog_features_in_ + 1) if self.exog_model_ is not None else 0  # +1 for intercept
        k = k_arar + k_exog
        sigma2 = max(np.sum(valid_residuals ** 2) / n, 1e-12)  # Ensure positive
        loglik = -0.5 * n * (np.log(2 * np.pi) + np.log(sigma2) + 1)
        self.aic_ = -2 * loglik + 2 * k
        self.bic_ = -2 * loglik + k * np.log(n)
    else:
        self.aic_ = np.nan
        self.bic_ = np.nan

    self.estimator_name_ = f"Arar(lags={self.lags_})"

    return self
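The information-criteria step at the end of `fit()` can be reproduced on its own. The sketch below follows the source above (Gaussian log-likelihood, six ARAR parameters, plus the regression coefficients and an intercept when exogenous variables are present). The function name `arar_info_criteria` is hypothetical and not part of skforecast, and the sketch omits the initial slicing by the largest lag, dropping NaN values only.

```python
import numpy as np


def arar_info_criteria(residuals, n_exog_features=0):
    """Return (aic, bic) from in-sample residuals, ignoring NaN values."""
    residuals = np.asarray(residuals, dtype=float)
    residuals = residuals[~np.isnan(residuals)]
    n = len(residuals)
    if n == 0:
        return float("nan"), float("nan")

    # 4 AR coefficients + mean (sbar) + variance (sigma2) = 6 parameters,
    # plus exog coefficients and an intercept when exog is used.
    k = 6 + (n_exog_features + 1 if n_exog_features > 0 else 0)

    sigma2 = max(np.sum(residuals**2) / n, 1e-12)  # guard against log(0)
    loglik = -0.5 * n * (np.log(2 * np.pi) + np.log(sigma2) + 1)
    return -2 * loglik + 2 * k, -2 * loglik + k * np.log(n)


aic, bic = arar_info_criteria([np.nan, 1.0, -1.0, 1.0, -1.0])
aic_exog, _ = arar_info_criteria([1.0, -1.0, 1.0, -1.0], n_exog_features=2)
```

With identical residuals, adding two exogenous features raises the AIC by exactly 6 (two coefficients plus the intercept, each penalised by 2), which is why the criteria are best compared across models with the same exogenous structure.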

predict ¶


predict(steps, exog=None)

Generate mean (point) forecasts for the next `steps` periods.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Forecast horizon (must be > 0)	required
`exog`	`ndarray, Series or DataFrame of shape (steps, n_exog_features)`	Exogenous variables for prediction.	`None`

Returns:

Name	Type	Description
`predictions`	`ndarray of shape (steps,)`	Point forecasts for horizons 1 to `steps`.
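The checks applied to `exog` at prediction time can be summarised in a standalone helper. This is a sketch that mirrors the validation in the method's source; `validate_future_exog` is a hypothetical name, not a skforecast function.

```python
import numpy as np


def validate_future_exog(exog, steps, n_exog_features_in):
    """Return exog as a 2D float array of shape (steps, n_exog_features_in)."""
    exog = np.asarray(exog, dtype=float)
    if exog.ndim == 1:
        exog = exog.reshape(-1, 1)  # a single feature passed as a 1D sequence
    elif exog.ndim != 2:
        raise ValueError("`exog` must be 1D or 2D.")
    if exog.shape[1] != n_exog_features_in:
        raise ValueError(
            f"Mismatch in exogenous features: fitted with {n_exog_features_in}, "
            f"got {exog.shape[1]}."
        )
    if len(exog) != steps:
        raise ValueError(f"Length of exog ({len(exog)}) must match steps ({steps}).")
    return exog


ok = validate_future_exog([0.1, 0.2, 0.3], steps=3, n_exog_features_in=1)

try:
    validate_future_exog([[0.1, 0.2]], steps=3, n_exog_features_in=2)
    error_message = None
except ValueError as err:
    error_message = str(err)
```

Note that a 1D input is always read as one feature observed over several steps, never as several features for a single step. To forecast one step with two features, pass a 2D array of shape (1, 2).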

Source code in skforecast\stats\_arar.py

@check_is_fitted
def predict(
    self, 
    steps: int, 
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None
) -> np.ndarray:
    """
    Generate mean forecasts steps ahead.

    Parameters
    ----------
    steps : int
        Forecast horizon (must be > 0)
    exog : ndarray, Series or DataFrame of shape (steps, n_exog_features), default None
        Exogenous variables for prediction.

    Returns
    -------
    predictions : ndarray of shape (steps,)
        Point forecasts for horizons 1 to `steps`.

    """

    if not isinstance(steps, (int, np.integer)) or steps <= 0:
        raise ValueError("`steps` must be a positive integer.")

    # Forecast ARAR component
    predictions = forecast(self.model_, h=steps)["mean"]

    if self.exog_model_ is None and exog is not None:
        raise ValueError(
            "Model was fitted without exog, but `exog` was provided for prediction. "
            "Please refit the model with exogenous variables."
        )

    if self.exog_model_ is not None:
        if exog is None:
            raise ValueError("Model was fitted with exog, so `exog` is required for prediction.")
        exog = np.asarray(exog, dtype=float)
        if exog.ndim == 1:
            exog = exog.reshape(-1, 1)
        elif exog.ndim != 2:
            raise ValueError("`exog` must be 1D or 2D.")

        # Check feature consistency
        if exog.shape[1] != self.n_exog_features_in_:
            raise ValueError(f"Mismatch in exogenous features: fitted with {self.n_exog_features_in_}, got {exog.shape[1]}.")

        if len(exog) != steps:
            raise ValueError(f"Length of exog ({len(exog)}) must match steps ({steps}).")

        # Forecast Regression component
        exog_pred = self.exog_model_.predict(exog)
        predictions = predictions + exog_pred

    return predictions

predict_interval ¶


predict_interval(
    steps=1, level=(80, 95), as_frame=True, exog=None
)

Forecast with symmetric normal-theory prediction intervals.

Parameters:

Name	Type	Description	Default
`steps`	`int`	Forecast horizon.	`1`
`level`	`iterable of int`	Confidence levels in percent.	`(80, 95)`
`as_frame`	`bool`	If True, return a tidy DataFrame with columns `mean`, `lower_<L>` and `upper_<L>` for each level L. If False, return a NumPy ndarray.	`True`
`exog`	`ndarray, Series or DataFrame of shape (steps, n_exog_features)`	Exogenous variables for prediction.	`None`

Returns:

Name	Type	Description
`predictions`	`numpy ndarray, pandas DataFrame`	If as_frame=True, pandas DataFrame with columns `mean`, `lower_<L>` and `upper_<L>` for each level L. If as_frame=False, numpy ndarray with the same column order.

Notes

When exogenous variables are used, prediction intervals account only for ARAR forecast uncertainty and do not include uncertainty from the regression coefficients. This may result in undercoverage (actual coverage < nominal level).
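A symmetric normal-theory interval has the form mean ± z·se, where z is the standard normal quantile for the requested level. The sketch below shows that construction and the resulting column layout. It is an illustration under assumptions, not the library's `forecast` routine: the per-step standard errors `se` are taken as given (how ARAR derives them is not shown in this section), and `normal_intervals` is a hypothetical name.

```python
from statistics import NormalDist

import numpy as np


def normal_intervals(mean, se, level=(80, 95), exog_pred=None):
    """Build [mean, lower_<L>, upper_<L>, ...] columns for each level L."""
    mean = np.asarray(mean, dtype=float)
    se = np.asarray(se, dtype=float)
    # The exogenous contribution shifts the centre only; the width is unchanged,
    # which is why regression uncertainty is not reflected in the interval.
    if exog_pred is None:
        shift = np.zeros_like(mean)
    else:
        shift = np.asarray(exog_pred, dtype=float)

    names, cols = ["mean"], [mean + shift]
    for lvl in level:
        z = NormalDist().inv_cdf(0.5 + lvl / 200)  # two-sided quantile
        cols += [mean + shift - z * se, mean + shift + z * se]
        names += [f"lower_{int(lvl)}", f"upper_{int(lvl)}"]
    return names, np.column_stack(cols)


names, table = normal_intervals(mean=[10.0, 10.5], se=[1.0, 1.4], exog_pred=[2.0, 2.0])
```

Because the exogenous prediction is added to the mean and to both bounds, the interval width depends only on `se`, which is the source of the possible undercoverage noted above.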

Source code in skforecast\stats\_arar.py

@check_is_fitted
def predict_interval(
    self,
    steps: int = 1,
    level=(80, 95),
    as_frame: bool = True,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None
) -> np.ndarray | pd.DataFrame:
    """
    Forecast with symmetric normal-theory prediction intervals.

    Parameters
    ----------
    steps : int, default 1
        Forecast horizon.
    level : iterable of int, default (80, 95)
        Confidence levels in percent.
    as_frame : bool, default True
        If True, return a tidy DataFrame with columns 'mean', 'lower_<L>',
        'upper_<L>' for each level L. If False, return a NumPy ndarray.
    exog : ndarray, Series or DataFrame of shape (steps, n_exog_features), default None
        Exogenous variables for prediction.

    Returns
    -------
    predictions : numpy ndarray, pandas DataFrame
        If as_frame=True, pandas DataFrame with columns 'mean', 'lower_<L>',
        'upper_<L>' for each level L. If as_frame=False, numpy ndarray.

    Notes
    -----
    When exogenous variables are used, prediction intervals account only for 
    ARAR forecast uncertainty and do not include uncertainty from the regression 
    coefficients. This may result in **undercoverage** (actual coverage < nominal level).

    """

    if not isinstance(steps, (int, np.integer)) or steps <= 0:
        raise ValueError("`steps` must be a positive integer.")

    raw_preds = forecast(self.model_, h=steps, level=level)

    if self.exog_model_ is None and exog is not None:
        raise ValueError(
            "Model was fitted without exog, but `exog` was provided for prediction. "
            "Please refit the model with exogenous variables."
        )

    if self.exog_model_ is not None:
        if exog is None:
            raise ValueError("Model was fitted with exog, so `exog` is required for prediction.")
        exog = np.asarray(exog, dtype=float)
        if exog.ndim == 1:
            exog = exog.reshape(-1, 1)
        elif exog.ndim != 2:
            raise ValueError("`exog` must be 1D or 2D.")

        # Check feature consistency
        if exog.shape[1] != self.n_exog_features_in_:
            raise ValueError(
                f"Mismatch in exogenous features: fitted with {self.n_exog_features_in_}, "
                f"got {exog.shape[1]}.")

        if len(exog) != steps:
            raise ValueError(f"Length of exog ({len(exog)}) must match steps ({steps}).")

        exog_pred = self.exog_model_.predict(exog)

        raw_preds["mean"] = raw_preds["mean"] + exog_pred
        # Broadcast the exog prediction across confidence columns
        raw_preds["upper"] = raw_preds["upper"] + exog_pred[:, np.newaxis]
        raw_preds["lower"] = raw_preds["lower"] + exog_pred[:, np.newaxis]

    levels = raw_preds["level"]
    n_levels = len(levels)
    cols = [raw_preds["mean"]]
    for i in range(n_levels):
        cols.append(raw_preds["lower"][:, i])
        cols.append(raw_preds["upper"][:, i])

    predictions = np.column_stack(cols)

    if as_frame:
        col_names = ["mean"]
        for level in levels:
            level = int(level)
            col_names.append(f"lower_{level}")
            col_names.append(f"upper_{level}")

        predictions = pd.DataFrame(
            predictions, columns=col_names, index=pd.RangeIndex(1, steps + 1, name="step")
        )

    return predictions

get_residuals ¶


get_residuals()

Get in-sample residuals (observed - fitted) from the ARAR model.

Returns:

Name	Type	Description
`residuals`	`ndarray of shape (n_samples,)`	In-sample residuals (observed - fitted).

Source code in skforecast\stats\_arar.py

@check_is_fitted
def get_residuals(self) -> np.ndarray:
    """
    Get in-sample residuals (observed - fitted) from the ARAR model.

    Returns
    -------
    residuals : ndarray of shape (n_samples,)

    """

    check_memory_reduced(self, method_name='get_residuals')
    return self.in_sample_residuals_

get_fitted_values ¶


get_fitted_values()

Get in-sample fitted values from the ARAR model.

Returns:

Name	Type	Description
`fitted`	`ndarray of shape (n_samples,)`	In-sample fitted values.

Source code in skforecast\stats\_arar.py

@check_is_fitted
def get_fitted_values(self) -> np.ndarray:
    """
    Get in-sample fitted values from the ARAR model.

    Returns
    -------
    fitted : ndarray of shape (n_samples,)

    """

    check_memory_reduced(self, method_name='get_fitted_values')
    return self.fitted_values_

get_score ¶


get_score(y=None)

R^2 using in-sample fitted values (ignores initial NaNs).

Parameters:

Name	Type	Description	Default
`y`	`ignored`	Present for API compatibility.	`None`

Returns:

Name	Type	Description
`score`	`float`	Coefficient of determination.
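A small worked example makes the NaN handling concrete. The sketch below repeats the calculation on hand-made arrays; `masked_r2` is a hypothetical helper, not part of the library.

```python
import numpy as np


def masked_r2(y, fitted):
    """R^2 computed only where a fitted value exists."""
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    mask = ~np.isnan(fitted)
    if mask.sum() < 2:
        return float("nan")
    ss_res = np.sum((y[mask] - fitted[mask]) ** 2)
    # eps avoids division by zero for a constant series
    ss_tot = np.sum((y[mask] - y[mask].mean()) ** 2) + np.finfo(float).eps
    return 1.0 - ss_res / ss_tot


# The first observation has no fitted value, so it is excluded from both sums.
score = masked_r2(y=[1.0, 2.0, 3.0, 4.0, 5.0], fitted=[np.nan, 2.5, 2.5, 4.5, 4.5])
```

Here the score is about 0.8: the residual sum of squares is 1 against a total sum of squares of 5 over the four retained points. The mean used in the total is the mean of the retained observations, not of the full series.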

Source code in skforecast\stats\_arar.py

@check_is_fitted
def get_score(self, y: Any = None) -> float:
    """
    R^2 using in-sample fitted values (ignores initial NaNs).

    Parameters
    ----------
    y : ignored
        Present for API compatibility.

    Returns
    -------
    score : float
        Coefficient of determination.

    """

    check_memory_reduced(self, method_name='get_score')

    y = self.y_train_
    fitted = self.fitted_values_

    mask = ~np.isnan(fitted)
    if mask.sum() < 2:
        return float("nan")
    ss_res = np.sum((y[mask] - fitted[mask]) ** 2)
    ss_tot = np.sum((y[mask] - y[mask].mean()) ** 2) + np.finfo(float).eps

    return 1.0 - ss_res / ss_tot

get_params ¶


get_params(deep=True)

Get parameters for this estimator.

Parameters:

Name	Type	Description	Default
`deep`	`bool`	If True, will return the parameters for this estimator and contained subobjects that are estimators.	`True`

Returns:

Name	Type	Description
`params`	`dict`	Parameter names mapped to their values.

Source code in skforecast\stats\_arar.py

def get_params(self, deep: bool = True) -> dict:
    """
    Get parameters for this estimator.

    Parameters
    ----------
    deep : bool, default True
        If True, will return the parameters for this estimator and
        contained subobjects that are estimators.

    Returns
    -------
    params : dict
        Parameter names mapped to their values.

    """

    return {
        "max_ar_depth": self.max_ar_depth,
        "max_lag": self.max_lag,
        "safe": self.safe
    }

get_feature_importances ¶


get_feature_importances()

Get feature importances for Arar model.
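The returned table pairs each selected lag with its AR coefficient and, when the model was fitted with exogenous variables, appends one row per regression coefficient. The values are raw coefficients, not normalised scores. A sketch of that layout follows, using plain tuples in place of the DataFrame and made-up coefficient values; `importance_rows` is a hypothetical helper.

```python
def importance_rows(lags, coef, coef_exog=None):
    """Return (feature, importance) pairs named like the Arar output."""
    rows = [(f"lag_{lag}", float(c)) for lag, c in zip(lags, coef)]
    if coef_exog is not None:
        # Exogenous features are named by position, not by column label.
        rows += [(f"exog_{i}", float(c)) for i, c in enumerate(coef_exog)]
    return rows


# Hypothetical fitted values, for illustration only.
rows = importance_rows(lags=(1, 2, 9, 13), coef=[0.6, -0.1, 0.0, 0.2], coef_exog=[1.5])
```

Since lag coefficients act on the residual series and exogenous coefficients act on the original series, the two groups of rows are not on a common scale and should not be ranked against each other.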

Source code in skforecast\stats\_arar.py

@check_is_fitted
def get_feature_importances(self) -> pd.DataFrame:
    """Get feature importances for Arar model."""
    importances = pd.DataFrame({
        'feature': [f'lag_{lag}' for lag in self.lags_],
        'importance': self.coef_
    })

    if self.coef_exog_ is not None:
        exog_importances = pd.DataFrame({
            'feature': [f'exog_{i}' for i in range(self.coef_exog_.shape[0])],
            'importance': self.coef_exog_
        })
        importances = pd.concat([importances, exog_importances], ignore_index=True)
        warnings.warn(
            "Exogenous variables are being handled using a two-step approach: "
            "(1) linear regression on exog, (2) ARAR on residuals. "
            "This affects model interpretation:\n"
            "  - ARAR coefficients (coef_) describe residual dynamics, not the original series\n"
            "  - Exogenous coefficients (coef_exog_) describe exogenous impact on original series",
            ExogenousInterpretationWarning
        )

    return importances
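Since the method above emits an `ExogenousInterpretationWarning` whenever exogenous coefficients are present, callers may want to record or silence that one category. The following is a minimal sketch using the standard `warnings` module. The warning class and the `importances_with_exog` function are local stand-ins (the real class is defined by skforecast, and its base class is assumed here to be `UserWarning`).

```python
import warnings


class ExogenousInterpretationWarning(UserWarning):
    """Local stand-in; the real class lives in skforecast (base class assumed)."""


def importances_with_exog():
    """Hypothetical stand-in that warns the way the method above does."""
    warnings.warn(
        "Exogenous variables are being handled using a two-step approach.",
        ExogenousInterpretationWarning,
    )
    return [("lag_1", 0.8), ("exog_0", 1.5)]


# Record the warning instead of displaying it, so it can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    rows = importances_with_exog()

# Silence only this category while leaving other warnings untouched.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=ExogenousInterpretationWarning)
    rows_quiet = importances_with_exog()
```

Filtering by category rather than ignoring all warnings keeps unrelated diagnostics visible.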

get_info_criteria ¶


get_info_criteria(criteria)

Get information criteria.

Parameters:

Name	Type	Description	Default
`criteria`	`str`	Information criterion to retrieve. Valid options are 'aic' and 'bic'.	required

Returns:

Name	Type	Description
`info_criteria`	`float`	Value of the requested information criterion.

Source code in skforecast\stats\_arar.py

@check_is_fitted
def get_info_criteria(self, criteria: str) -> float:
    """
    Get information criteria.

    Parameters
    ----------
    criteria : str
        Information criterion to retrieve. Valid options are 'aic' and 'bic'.

    Returns
    -------
    info_criteria : float
        Value of the requested information criterion.

    """
    if criteria not in {'aic', 'bic'}:
        raise ValueError(
            "Invalid value for `criteria`. Valid options are 'aic' and 'bic' "
            "for ARAR model."
        )

    if criteria == 'aic':
        value = self.aic_
    else:
        value = self.bic_

    return value
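For orientation, the textbook definitions are AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where L is the maximized likelihood, k the number of estimated parameters and n the number of observations. The sketch below applies those definitions with the same `criteria` validation as the method above. The helper `info_criterion` is hypothetical, and this section does not show how `Arar` computes `aic_` and `bic_`, so the formulas are an assumption about the general definitions only.

```python
import math


def info_criterion(loglik: float, n_params: int, n_obs: int, criteria: str) -> float:
    """Textbook AIC/BIC from a maximized log-likelihood (illustrative helper)."""
    if criteria not in {"aic", "bic"}:
        raise ValueError(
            "Invalid value for `criteria`. Valid options are 'aic' and 'bic'."
        )
    if criteria == "aic":
        # AIC = 2k - 2 ln(L)
        return 2 * n_params - 2 * loglik
    # BIC = k ln(n) - 2 ln(L)
    return n_params * math.log(n_obs) - 2 * loglik


aic = info_criterion(loglik=-100.0, n_params=3, n_obs=50, criteria="aic")
bic = info_criterion(loglik=-100.0, n_params=3, n_obs=50, criteria="bic")
```

Lower values indicate a better trade-off between fit and complexity under either criterion. BIC penalizes extra parameters more heavily once n exceeds about 7 (where ln(n) > 2).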

set_params ¶


set_params(**params)

Set the parameters of this estimator and reset the fitted state.

This method resets the estimator to its unfitted state whenever parameters are changed, requiring the model to be refitted before making predictions.

Parameters:

Name	Type	Description	Default
`**params`	`dict`	Estimator parameters. Valid parameter keys are 'max_ar_depth', 'max_lag', and 'safe'.	`{}`

Returns:

Type	Description
`Arar`	The estimator with updated parameters and reset state.

Source code in skforecast\stats\_arar.py

def set_params(self, **params) -> "Arar":
    """
    Set the parameters of this estimator and reset the fitted state.

    This method resets the estimator to its unfitted state whenever parameters
    are changed, requiring the model to be refitted before making predictions.

    Parameters
    ----------
    **params : dict
        Estimator parameters. Valid parameter keys are 'max_ar_depth', 'max_lag',
        and 'safe'.

    Returns
    -------
    Arar
        The estimator with updated parameters and reset state.

    """

    valid_params = {'max_ar_depth', 'max_lag', 'safe'}
    for key in params.keys():
        if key not in valid_params:
            raise ValueError(
                f"Invalid parameter '{key}' for estimator {self.__class__.__name__}. "
                f"Valid parameters are: {valid_params}"
            )

    for key, value in params.items():
        setattr(self, key, value)

    # Reset fitted state
    self.lags_                  = None
    self.sigma2_                = None
    self.psi_                   = None
    self.sbar_                  = None

    self.model_                 = None
    self.coef_                  = None
    self.aic_                   = None
    self.bic_                   = None
    self.exog_model_            = None
    self.coef_exog_             = None
    self.n_exog_features_in_    = None
    self.y_train_               = None
    self.fitted_values_         = None
    self.in_sample_residuals_   = None
    self.n_features_in_         = None
    self.is_memory_reduced      = False
    self.is_fitted              = False
    self.estimator_name_        = "Arar()"

    return self
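The validate, assign, reset sequence above can be illustrated with a reduced stand-in. `ArarLike`, its constructor defaults and its toy `fit` are hypothetical. Only the structure of `set_params` follows the source, and only two of the fitted attributes are tracked to keep the sketch short.

```python
class ArarLike:
    """Hypothetical stand-in reproducing only the reset logic shown above."""

    _valid_params = ("max_ar_depth", "max_lag", "safe")

    def __init__(self, max_ar_depth=None, max_lag=None, safe=True):
        self.max_ar_depth = max_ar_depth
        self.max_lag = max_lag
        self.safe = safe
        self.coef_ = None
        self.is_fitted = False

    def fit(self):
        # Toy "fit": just populate the fitted state.
        self.coef_ = [0.5, -0.2]
        self.is_fitted = True
        return self

    def set_params(self, **params):
        # Validate every key before assigning anything.
        for key in params:
            if key not in self._valid_params:
                raise ValueError(f"Invalid parameter '{key}'.")
        for key, value in params.items():
            setattr(self, key, value)
        # Reset fitted state, as in the source.
        self.coef_ = None
        self.is_fitted = False
        return self


model = ArarLike().fit()
fitted_before = model.is_fitted
model.set_params(max_lag=20)
fitted_after = model.is_fitted

try:
    model.set_params(order=(1, 0, 0))  # not a valid key for this estimator
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

As written in the source, the reset runs on every successful call, including one that passes the values already set, and an invalid key raises before any attribute is modified.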

summary ¶


summary()

Print a simple textual summary of the fitted Arar model.

Source code in skforecast\stats\_arar.py

@check_is_fitted
def summary(self) -> None:
    """
    Print a simple textual summary of the fitted Arar model.
    """

    print(f"{self.estimator_name_} Model Summary")
    print("------------------")
    print(f"Selected AR lags:                         {self.lags_}")
    print(f"AR coefficients (phi):                    {np.round(self.coef_, 4)}")
    print(f"Residual variance (sigma^2):              {self.sigma2_:.4f}")
    print(f"Mean of shortened series (sbar):          {self.sbar_:.4f}")
    print(f"Length of memory-shortening filter (psi): {len(self.psi_)}")

    if not self.is_memory_reduced:
        print("\nTime Series Summary Statistics")
        print(f"Number of observations: {len(self.y_train_)}")
        print(f"Mean:                   {np.mean(self.y_train_):.4f}")
        print(f"Std Dev:                {np.std(self.y_train_, ddof=1):.4f}")
        print(f"Min:                    {np.min(self.y_train_):.4f}")
        print(f"25%:                    {np.percentile(self.y_train_, 25):.4f}")
        print(f"Median:                 {np.median(self.y_train_):.4f}")
        print(f"75%:                    {np.percentile(self.y_train_, 75):.4f}")
        print(f"Max:                    {np.max(self.y_train_):.4f}")

    print("\nModel Diagnostics")
    print(f"AIC: {self.aic_:.4f}")
    print(f"BIC: {self.bic_:.4f}")

    if self.exog_model_ is not None:
        print("\nExogenous Model (Linear Regression)")
        print("-----------------------------------")
        print(f"Number of features: {self.n_exog_features_in_}")
        print(f"Intercept: {self.exog_model_.intercept_:.4f}")
        print(f"Coefficients: {np.round(self.exog_model_.coef_, 4)}")
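`summary()` prints its report and returns `None`, so the text has to be captured from standard output if it is needed as a string (for logging, for example). This is a minimal sketch with `contextlib.redirect_stdout`. The `summary` function here is a hypothetical stand-in that prints a few lines in the same style, with made-up values.

```python
import contextlib
import io


def summary() -> None:
    """Hypothetical stand-in that prints like the method above."""
    print("Arar() Model Summary")
    print("------------------")
    print(f"Residual variance (sigma^2):              {0.1234:.4f}")


# Redirect everything printed inside the block into an in-memory buffer.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    result = summary()

text = buffer.getvalue()
```

With a real fitted estimator, the same pattern would wrap the call to its `summary()` method.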

reduce_memory ¶


reduce_memory()

Reduce memory usage by removing internal arrays not needed for prediction.

This method clears memory-heavy arrays that are needed only for diagnostics, not for prediction. After calling it, the following methods will raise an error:

fitted_(): In-sample fitted values
residuals_(): In-sample residuals
score(): R² coefficient
summary(): Model summary statistics

Prediction methods remain fully functional:

predict(): Point forecasts
predict_interval(): Prediction intervals

Returns:

Name	Type	Description
`self`	`Arar`	The estimator with reduced memory usage.

Source code in skforecast\stats\_arar.py

@check_is_fitted
def reduce_memory(self) -> "Arar":
    """
    Reduce memory usage by removing internal arrays not needed for prediction.
    This method clears memory-heavy arrays that are only needed for diagnostics
    but not for prediction. After calling this method, the following methods
    will raise an error:

    - fitted_(): In-sample fitted values
    - residuals_(): In-sample residuals
    - score(): R² coefficient
    - summary(): Model summary statistics

    Prediction methods remain fully functional:

    - predict(): Point forecasts
    - predict_interval(): Prediction intervals

    Returns
    -------
    self : Arar
        The estimator with reduced memory usage.

    """

    self.fitted_values_ = None
    self.in_sample_residuals_ = None

    self.is_memory_reduced = True

    return self
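The effect described above can be sketched with a stand-in whose `reduce_memory` matches the source. Everything else is hypothetical: the `residuals` guard, its `RuntimeError` and the toy `predict` are assumptions, since the real diagnostic methods and the error they raise are not shown in this section.

```python
class ArarLike:
    """Hypothetical stand-in; only reduce_memory() mirrors the source above."""

    def __init__(self):
        self.coef_ = [0.5]
        self.in_sample_residuals_ = [0.1, -0.2, 0.05]
        self.fitted_values_ = [1.0, 1.1, 0.9]
        self.is_memory_reduced = False

    def reduce_memory(self):
        # Drop diagnostic arrays and flag the estimator, as in the source.
        self.fitted_values_ = None
        self.in_sample_residuals_ = None
        self.is_memory_reduced = True
        return self

    def residuals(self):
        # Assumed guard: the real check and error type are not shown here.
        if self.is_memory_reduced:
            raise RuntimeError("Residuals were dropped by reduce_memory().")
        return self.in_sample_residuals_

    def predict(self, steps):
        # Toy forecast that depends only on the retained coefficient.
        return [self.coef_[0]] * steps


# reduce_memory() returns self, so calls can be chained.
model = ArarLike().reduce_memory()
forecast = model.predict(steps=2)

try:
    model.residuals()
    diagnostics_available = True
except RuntimeError:
    diagnostics_available = False
```

Note that the source sets only `fitted_values_` and `in_sample_residuals_` to `None`. Coefficients and the other fitted attributes are left in place, which is why prediction keeps working.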
