
stats

skforecast.stats._arima.Arima

Arima(
    order=(1, 0, 0),
    seasonal_order=(0, 0, 0),
    m=1,
    include_mean=True,
    transform_pars=True,
    method="CSS-ML",
    n_cond=None,
    SSinit="Gardner1980",
    optim_method="BFGS",
    optim_kwargs=None,
    kappa=1000000.0,
    max_p=5,
    max_q=5,
    max_P=2,
    max_Q=2,
    max_order=5,
    max_d=2,
    max_D=1,
    start_p=2,
    start_q=2,
    start_P=1,
    start_Q=1,
    stationary=False,
    seasonal=True,
    ic="aicc",
    stepwise=True,
    nmodels=94,
    trace=False,
    approximation=None,
    truncate=None,
    test="kpss",
    test_kwargs=None,
    seasonal_test="seas",
    seasonal_test_kwargs=None,
    allowdrift=True,
    allowmean=True,
    lambda_bc=None,
    biasadj=False,
)

Bases: BaseEstimator, RegressorMixin

Scikit-learn style wrapper for the ARIMA (AutoRegressive Integrated Moving Average) model and the auto-ARIMA model selection algorithm.

This estimator takes a univariate time series as input. Call fit(y) with a 1D array-like of observations in time order, then produce out-of-sample forecasts via predict(steps) and prediction intervals via predict_interval(steps, level=...). In-sample diagnostics are available through get_fitted_values(), get_residuals() and summary().
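
Below is a minimal usage sketch. It assumes the class can be imported from the module path shown in this page's heading (skforecast.stats._arima) and uses synthetic data purely for illustration.

import numpy as np
from skforecast.stats._arima import Arima  # module path as documented on this page

# Synthetic AR(1) series, for illustration only
rng = np.random.default_rng(42)
y = np.empty(200)
y[0] = 0.0
for t in range(1, 200):
    y[t] = 0.7 * y[t - 1] + rng.normal()

model = Arima(order=(1, 0, 0))
model.fit(y)

mean_forecast = model.predict(steps=10)                    # ndarray of shape (10,)
intervals = model.predict_interval(steps=10, level=[95])   # DataFrame: mean, lower_95, upper_95

print(model.estimator_name_)   # e.g. "Arima(1,0,0)"
print(model.get_residuals().shape)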

Parameters:

Name Type Description Default
order tuple of int or None

The (p, d, q) order of the non-seasonal ARIMA model:

  • p: AR order (number of lag observations).
  • d: Degree of differencing (number of times the series is differenced).
  • q: MA order (size of the moving average window).

If None, the order will be automatically selected using auto_arima during fitting.

(1, 0, 0)
seasonal_order tuple of int or None

The (P, D, Q) order of the seasonal component:

  • P: Seasonal AR order.
  • D: Seasonal differencing order.
  • Q: Seasonal MA order.

If None, the seasonal order will be automatically selected using auto_arima during fitting.

(0, 0, 0)
m int

Seasonal period (e.g., 12 for monthly data with yearly seasonality, 4 for quarterly data). Set to 1 for non-seasonal models.

1
include_mean bool

Whether to include a mean/intercept term in the model. Only applies when there is no differencing (d=0 and D=0).

True
transform_pars bool

Whether to transform parameters to ensure stationarity and invertibility during optimization.

True
method str

Estimation method. Options:

  • "CSS-ML": Conditional sum of squares for initial values, then maximum likelihood.
  • "ML": Maximum likelihood only.
  • "CSS": Conditional sum of squares only.

"CSS-ML"
n_cond int

Number of initial observations to use for conditional sum of squares. If None, defaults to max(p + d*m + P*m, q + Q*m).

None
SSinit str

Method for state-space initialization. Options:

  • "Gardner1980": Gardner's method (default, more numerically stable).
  • "Rossignol2011": Rossignol's method (alternative).

"Gardner1980"
optim_method str

Optimization method passed to scipy.optimize.minimize. Common options include "BFGS", "L-BFGS-B", "Nelder-Mead", etc.

"BFGS"
optim_kwargs dict or None

Additional options passed to the optimizer (e.g., maxiter, ftol). If None, defaults to {'maxiter': 1000}.

None
kappa float

Prior variance for diffuse states in the Kalman filter.

1e6
max_p int

Maximum AR order for automatic model selection.

5
max_q int

Maximum MA order for automatic model selection.

5
max_P int

Maximum seasonal AR order for automatic model selection.

2
max_Q int

Maximum seasonal MA order for automatic model selection.

2
max_order int

Maximum sum of p+q+P+Q for automatic model selection.

5
max_d int

Maximum non-seasonal differencing order for automatic selection.

2
max_D int

Maximum seasonal differencing order for automatic selection.

1
start_p int

Starting AR order for stepwise search.

2
start_q int

Starting MA order for stepwise search.

2
start_P int

Starting seasonal AR order for stepwise search.

1
start_Q int

Starting seasonal MA order for stepwise search.

1
stationary bool

Restrict automatic search to stationary models (d=D=0).

False
seasonal bool

Include seasonal components in automatic search.

True
ic str

Information criterion for automatic model selection: "aicc", "aic", or "bic".

"aicc"
stepwise bool

Use stepwise search (faster) or exhaustive grid search for automatic selection.

True
nmodels int

Maximum number of models to try in stepwise search.

94
trace bool

Print progress during automatic model selection.

False
approximation bool or None

Use CSS approximation during automatic search. If None, auto-determined based on data size.

None
truncate int or None

Truncate series to this length for approximation offset computation.

None
test str

Unit root test for automatic differencing determination: "kpss", "adf", or "pp".

"kpss"
test_kwargs dict or None

Additional arguments for unit root test.

None
seasonal_test str

Seasonal test for automatic seasonal differencing: "seas", "ocsb", "hegy", or "ch".

"seas"
seasonal_test_kwargs dict or None

Additional arguments for seasonal test.

None
allowdrift bool

Allow drift term in automatic selection when d+D=1.

True
allowmean bool

Allow mean term in automatic selection when d+D=0.

True
lambda_bc float, str, or None

Box-Cox transformation parameter:

  • None: No transformation.
  • "auto": Automatically select lambda using Guerrero's method.
  • float: Use the specified lambda value (0 = log transform).

None
biasadj bool

Bias adjustment for Box-Cox back-transformation (produces mean forecasts instead of median).

False
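
As an illustration of how these parameters combine, the sketch below configures a fixed seasonal specification and, separately, a bounded automatic search. All values are arbitrary and chosen only for illustration.

from skforecast.stats._arima import Arima  # module path as documented on this page

# Fixed seasonal ARIMA(1,1,1)(1,1,1)[12] estimated by CSS-ML
model = Arima(
    order=(1, 1, 1),
    seasonal_order=(1, 1, 1),
    m=12,
    method="CSS-ML",
    optim_kwargs={"maxiter": 2000},
)

# Fully automatic selection: leave both orders as None and bound the search
auto_model = Arima(
    order=None,
    seasonal_order=None,
    m=12,
    max_p=3,
    max_q=3,
    ic="aicc",
    stepwise=True,
    lambda_bc="auto",   # Box-Cox lambda selected with Guerrero's method
)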

Attributes:

Name Type Description
order tuple of int

(p, d, q) non-seasonal ARIMA order stored on the estimator.

seasonal_order tuple of int

(P, D, Q) seasonal ARIMA order stored on the estimator.

m int

Seasonal period (e.g., 12 for monthly data).

include_mean bool

Whether a mean/intercept term is included in the model.

transform_pars bool

Whether parameters are transformed to enforce stationarity/invertibility.

method str

Estimation method (e.g., "CSS-ML", "ML", "CSS").

n_cond int or None

Number of observations used for conditional sum of squares (if any).

SSinit str

State-space initialization method (e.g., "Gardner1980").

optim_method str

Optimization method passed to the optimizer (e.g., "BFGS").

optim_kwargs dict or None

Additional optimizer options.

kappa float

Prior variance for diffuse states in the Kalman filter.

max_p int, default 5

Maximum AR order for automatic model selection.

max_q int, default 5

Maximum MA order for automatic model selection.

max_P int, default 2

Maximum seasonal AR order for automatic model selection.

max_Q int, default 2

Maximum seasonal MA order for automatic model selection.

max_order int, default 5

Maximum sum of p+q+P+Q for automatic model selection.

max_d int, default 2

Maximum non-seasonal differencing order for automatic selection.

max_D int, default 1

Maximum seasonal differencing order for automatic selection.

start_p int, default 2

Starting AR order for stepwise search.

start_q int, default 2

Starting MA order for stepwise search.

start_P int, default 1

Starting seasonal AR order for stepwise search.

start_Q int, default 1

Starting seasonal MA order for stepwise search.

stationary bool, default False

Restrict automatic search to stationary models (d=D=0).

seasonal bool, default True

Include seasonal components in automatic search.

ic str, default "aicc"

Information criterion for automatic model selection: "aicc", "aic", or "bic".

stepwise bool, default True

Use stepwise search (faster) or exhaustive grid search for automatic selection.

nmodels int, default 94

Maximum number of models to try in stepwise search.

trace bool, default False

Print progress during automatic model selection.

approximation bool or None, default None

Use CSS approximation during automatic search. If None, auto-determined based on data size.

truncate int or None, default None

Truncate series to this length for approximation offset computation.

test str, default "kpss"

Unit root test for automatic differencing determination: "kpss", "adf", or "pp".

test_kwargs dict or None, default None

Additional arguments for unit root test.

seasonal_test str, default "seas"

Seasonal test for automatic seasonal differencing: "seas", "ocsb", "hegy", or "ch".

seasonal_test_kwargs dict or None, default None

Additional arguments for seasonal test.

allowdrift bool, default True

Allow drift term in automatic selection when d+D=1.

allowmean bool, default True

Allow mean term in automatic selection when d+D=0.

lambda_bc float, str, or None, default None

Box-Cox transformation parameter:

  • None: No transformation.
  • "auto": Automatically select lambda using Guerrero's method.
  • float: Use the specified lambda value (0 = log transform).

biasadj bool, default False

Bias adjustment for Box-Cox back-transformation (produces mean forecasts instead of median). Only available for auto arima mode.

model_ dict

Dictionary containing the fitted ARIMA model with keys:

  • 'y': Original training series.
  • 'fitted': In-sample fitted values.
  • 'coef': Coefficient DataFrame.
  • 'sigma2': Innovation variance.
  • 'var_coef': Variance-covariance matrix.
  • 'loglik': Log-likelihood.
  • 'aic': Akaike Information Criterion.
  • 'bic': Bayesian Information Criterion.
  • 'arma': ARIMA specification [p, q, P, Q, m, d, D].
  • 'residuals': Model residuals.
  • 'converged': Convergence status.
  • 'model': State-space model dict.
  • 'method': Estimation method string.

y_train_ ndarray of shape (n_samples,)

Original training series used for fitting.

coef_ ndarray

Flattened array of fitted coefficients (AR, MA, exogenous, intercept if present).

coef_names_ list of str

Names of coefficients in coef_.

sigma2_ float

Innovation variance (residual variance).

loglik_ float

Log-likelihood of the fitted model.

aic_ float

Akaike Information Criterion value.

bic_ float or None

Bayesian Information Criterion value (may be None if not available).

arma_ list of int

ARIMA specification: [p, q, P, Q, m, d, D].

converged_ bool

Whether the optimization converged successfully.

n_features_in_ int

Number of features in the target series (always 1, for sklearn compatibility).

n_exog_names_in_ list

Names of exogenous features seen during fitting (None if no exog was provided or if exog was not a pandas DataFrame).

n_exog_features_in_ int

Number of exogenous features seen during fitting (0 if no exog provided).

fitted_values_ ndarray of shape (n_samples,)

In-sample fitted values.

in_sample_residuals_ ndarray of shape (n_samples,)

In-sample residuals (observed - fitted).

var_coef_ ndarray

Variance-covariance matrix of coefficients.

best_params_ dict or None

If auto arima was used, dictionary with 'order', 'seasonal_order' and 'm' of the selected best model. Otherwise None.

is_auto bool

Flag indicating whether auto arima model selection is used.

is_memory_reduced bool

Flag indicating whether reduce_memory() has been called.

is_fitted bool

Flag indicating whether the estimator has been fitted.

estimator_name_ str

String identifier of the model configuration (e.g., "Arima(1,1,1)" or "Arima(1,1,1)(0,1,1)[12]"). This is updated after fitting with automatic model selection to reflect the chosen model.

Notes

The ARIMA model supports exogenous regressors, which are incorporated directly into the likelihood function, unlike the two-step approach used in the ARAR model. This means the exogenous variables are modeled jointly with the ARMA errors, providing a more integrated treatment.

The model uses a state-space representation and the Kalman filter for likelihood computation and forecasting, which allows handling of missing values and provides efficient recursive prediction.
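
The following sketch fits a model with an exogenous regressor and supplies future values of it at prediction time. The data and the regressor name ("temperature") are synthetic, for illustration only.

import numpy as np
import pandas as pd
from skforecast.stats._arima import Arima  # module path as documented on this page

rng = np.random.default_rng(0)
n = 120
exog = pd.DataFrame({"temperature": rng.normal(20, 5, n)})
y = 10 + 0.5 * exog["temperature"].to_numpy() + rng.normal(0, 1, n)

model = Arima(order=(1, 0, 0))
model.fit(y, exog=exog)

# Future exogenous values must cover the forecast horizon and have the
# same number of columns as those used during fitting.
future_exog = pd.DataFrame({"temperature": rng.normal(20, 5, 12)})
predictions = model.predict(steps=12, exog=future_exog)

# Coefficient names for exogenous features are taken from the DataFrame columns
print(model.get_feature_importances())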

Methods:

Name Description
fit

Fit the ARIMA model to a univariate time series.

predict

Generate mean forecasts steps ahead.

predict_interval

Forecast with prediction intervals.

get_residuals

Get in-sample residuals (observed - fitted) from the ARIMA model.

get_fitted_values

Get in-sample fitted values from the ARIMA model.

get_feature_importances

Get feature importances for the Arima model.

get_score

Compute R^2 score using in-sample fitted values.

get_info_criteria

Get the selected information criterion.

get_params

Get parameters for this estimator.

set_params

Set the parameters of this estimator and reset the fitted state.

summary

Print a summary of the fitted ARIMA model.

reduce_memory

Free memory by deleting large attributes after fitting.

Source code in skforecast\stats\_arima.py
def __init__(
    self,
    order: tuple[int, int, int] | None = (1, 0, 0),
    seasonal_order: tuple[int, int, int] | None = (0, 0, 0),
    m: int = 1,
    include_mean: bool = True,
    transform_pars: bool = True,
    method: str = "CSS-ML",
    n_cond: int | None = None,
    SSinit: str = "Gardner1980",
    optim_method: str = "BFGS",
    optim_kwargs: dict | None = None,
    kappa: float = 1e6,
    max_p: int = 5,
    max_q: int = 5,
    max_P: int = 2,
    max_Q: int = 2,
    max_order: int = 5,
    max_d: int = 2,
    max_D: int = 1,
    start_p: int = 2,
    start_q: int = 2,
    start_P: int = 1,
    start_Q: int = 1,
    stationary: bool = False,
    seasonal: bool = True,
    ic: str = "aicc",
    stepwise: bool = True,
    nmodels: int = 94,
    trace: bool = False,
    approximation: bool | None = None,
    truncate: int | None = None,
    test: str = "kpss",
    test_kwargs: dict | None = None,
    seasonal_test: str = "seas",
    seasonal_test_kwargs: dict | None = None,
    allowdrift: bool = True,
    allowmean: bool = True,
    lambda_bc: float | str | None = None,
    biasadj: bool = False,
):

    if order is not None and len(order) != 3:
        raise ValueError(
            f"`order` must be a tuple of length 3, got length {len(order)}"
        )
    if seasonal_order is not None and len(seasonal_order) != 3:
        raise ValueError(
            f"`seasonal_order` must be a tuple of length 3, got length {len(seasonal_order)}"
        )
    if not isinstance(m, int) or m < 1:
        raise ValueError("`m` must be a positive integer (seasonal period).")

    self.order                = order
    self.seasonal_order       = seasonal_order
    self.m                    = m
    self.include_mean         = include_mean
    self.transform_pars       = transform_pars
    self.method               = method
    self.n_cond               = n_cond
    self.SSinit               = SSinit
    self.optim_method         = optim_method
    self.optim_kwargs         = optim_kwargs
    self.kappa                = kappa
    self.max_p                = max_p
    self.max_q                = max_q
    self.max_P                = max_P
    self.max_Q                = max_Q
    self.max_order            = max_order
    self.max_d                = max_d
    self.max_D                = max_D
    self.start_p              = start_p
    self.start_q              = start_q
    self.start_P              = start_P
    self.start_Q              = start_Q
    self.stationary           = stationary
    self.seasonal             = seasonal
    self.ic                   = ic
    self.stepwise             = stepwise
    self.nmodels              = nmodels
    self.trace                = trace
    self.approximation        = approximation
    self.truncate             = truncate
    self.test                 = test
    self.test_kwargs          = test_kwargs
    self.seasonal_test        = seasonal_test
    self.seasonal_test_kwargs = seasonal_test_kwargs
    self.allowdrift           = allowdrift
    self.allowmean            = allowmean
    self.lambda_bc            = lambda_bc
    self.biasadj              = biasadj       

    self.is_auto              = order is None or seasonal_order is None
    self.model_               = None
    self.y_train_             = None
    self.coef_                = None
    self.coef_names_          = None
    self.sigma2_              = None
    self.loglik_              = None
    self.aic_                 = None
    self.bic_                 = None
    self.arma_                = None
    self.converged_           = None
    self.fitted_values_       = None
    self.in_sample_residuals_ = None
    self.var_coef_            = None
    self.n_features_in_       = None
    self.n_exog_names_in_     = None
    self.n_exog_features_in_  = None
    self.is_memory_reduced    = False
    self.is_fitted            = False
    self.best_params_         = None

    if self.optim_kwargs is None:
        self.optim_kwargs = {'maxiter': 1000}

    if self.is_auto:
        estimator_name_ = "AutoArima()"
    else:
        p, d, q = self.order
        P, D, Q = self.seasonal_order
        if P == 0 and D == 0 and Q == 0:
            estimator_name_ = f"Arima({p},{d},{q})"
        else:
            estimator_name_ = f"Arima({p},{d},{q})({P},{D},{Q})[{self.m}]"

    self.estimator_name_ = estimator_name_

order instance-attribute

order = order

seasonal_order instance-attribute

seasonal_order = seasonal_order

m instance-attribute

m = m

include_mean instance-attribute

include_mean = include_mean

transform_pars instance-attribute

transform_pars = transform_pars

method instance-attribute

method = method

n_cond instance-attribute

n_cond = n_cond

SSinit instance-attribute

SSinit = SSinit

optim_method instance-attribute

optim_method = optim_method

optim_kwargs instance-attribute

optim_kwargs = optim_kwargs

kappa instance-attribute

kappa = kappa

max_p instance-attribute

max_p = max_p

max_q instance-attribute

max_q = max_q

max_P instance-attribute

max_P = max_P

max_Q instance-attribute

max_Q = max_Q

max_order instance-attribute

max_order = max_order

max_d instance-attribute

max_d = max_d

max_D instance-attribute

max_D = max_D

start_p instance-attribute

start_p = start_p

start_q instance-attribute

start_q = start_q

start_P instance-attribute

start_P = start_P

start_Q instance-attribute

start_Q = start_Q

stationary instance-attribute

stationary = stationary

seasonal instance-attribute

seasonal = seasonal

ic instance-attribute

ic = ic

stepwise instance-attribute

stepwise = stepwise

nmodels instance-attribute

nmodels = nmodels

trace instance-attribute

trace = trace

approximation instance-attribute

approximation = approximation

truncate instance-attribute

truncate = truncate

test instance-attribute

test = test

test_kwargs instance-attribute

test_kwargs = test_kwargs

seasonal_test instance-attribute

seasonal_test = seasonal_test

seasonal_test_kwargs instance-attribute

seasonal_test_kwargs = seasonal_test_kwargs

allowdrift instance-attribute

allowdrift = allowdrift

allowmean instance-attribute

allowmean = allowmean

lambda_bc instance-attribute

lambda_bc = lambda_bc

biasadj instance-attribute

biasadj = biasadj

is_auto instance-attribute

is_auto = order is None or seasonal_order is None

model_ instance-attribute

model_ = None

y_train_ instance-attribute

y_train_ = None

coef_ instance-attribute

coef_ = None

coef_names_ instance-attribute

coef_names_ = None

sigma2_ instance-attribute

sigma2_ = None

loglik_ instance-attribute

loglik_ = None

aic_ instance-attribute

aic_ = None

bic_ instance-attribute

bic_ = None

arma_ instance-attribute

arma_ = None

converged_ instance-attribute

converged_ = None

fitted_values_ instance-attribute

fitted_values_ = None

in_sample_residuals_ instance-attribute

in_sample_residuals_ = None

var_coef_ instance-attribute

var_coef_ = None

n_features_in_ instance-attribute

n_features_in_ = None

n_exog_names_in_ instance-attribute

n_exog_names_in_ = None

n_exog_features_in_ instance-attribute

n_exog_features_in_ = None

is_memory_reduced instance-attribute

is_memory_reduced = False

is_fitted instance-attribute

is_fitted = False

best_params_ instance-attribute

best_params_ = None

estimator_name_ instance-attribute

estimator_name_ = estimator_name_

fit

fit(y, exog=None, suppress_warnings=False)

Fit the ARIMA model to a univariate time series.

If order or seasonal_order were not specified during initialization (i.e., set to None), this method will automatically determine the best model using auto arima with stepwise search.
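
A sketch of automatic order selection, using a synthetic series with a yearly-like seasonal pattern (data and parameter choices are illustrative only):

import numpy as np
from skforecast.stats._arima import Arima  # module path as documented on this page

rng = np.random.default_rng(1)
t = np.arange(144)
y = 10 + 0.05 * t + 2 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.5, 144)

# Both orders set to None -> auto arima with stepwise search
model = Arima(order=None, seasonal_order=None, m=12)
model.fit(y)

print(model.best_params_)      # {'order': (p, d, q), 'seasonal_order': (P, D, Q), 'm': 12}
print(model.estimator_name_)   # e.g. "AutoArima(p,d,q)(P,D,Q)[12]"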

Parameters:

Name Type Description Default
y pandas Series, numpy ndarray of shape (n_samples,)

Time-ordered numeric sequence.

required
exog pandas Series, pandas DataFrame, numpy ndarray of shape (n_samples, n_exog_features)

Exogenous regressors to include in the model. These are incorporated directly into the ARIMA likelihood function.

None
suppress_warnings bool

If True, suppress warnings during fitting (e.g., convergence warnings).

False

Returns:

Name Type Description
self Arima

Fitted estimator. After fitting with automatic model selection, the selected order and seasonal order are stored in best_params_, and estimator_name_ is updated to reflect the chosen model.

Source code in skforecast\stats\_arima.py
def fit(
    self, 
    y: np.ndarray | pd.Series, 
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None,
    suppress_warnings: bool = False
) -> "Arima":
    """
    Fit the ARIMA model to a univariate time series.

    If `order` or `seasonal_order` were not specified during initialization 
    (i.e., set to None), this method will automatically determine the best 
    model using auto arima with stepwise search.

    Parameters
    ----------
    y : pandas Series, numpy ndarray of shape (n_samples,)
        Time-ordered numeric sequence.
    exog : pandas Series, pandas DataFrame,  numpy ndarray of shape (n_samples, n_exog_features), default None
        Exogenous regressors to include in the model. These are incorporated 
        directly into the ARIMA likelihood function.
    suppress_warnings : bool, default False
        If True, suppress warnings during fitting (e.g., convergence warnings).

    Returns
    -------
    self : Arima
        Fitted estimator. After fitting with automatic model selection, the
        selected order and seasonal order are stored in `best_params_`, and
        `estimator_name_` is updated to reflect the chosen model.

    """

    self.is_auto              = self.order is None or self.seasonal_order is None
    self.model_               = None
    self.y_train_             = None
    self.coef_                = None
    self.coef_names_          = None
    self.sigma2_              = None
    self.loglik_              = None
    self.aic_                 = None
    self.bic_                 = None
    self.arma_                = None
    self.converged_           = None
    self.fitted_values_       = None
    self.in_sample_residuals_ = None
    self.var_coef_            = None
    self.n_features_in_       = None
    self.n_exog_names_in_     = None
    self.n_exog_features_in_  = None
    self.is_memory_reduced    = False
    self.is_fitted            = False
    self.best_params_         = None

    if not isinstance(y, (np.ndarray, pd.Series)):
        raise TypeError("`y` must be a pandas Series or numpy array.")

    if not isinstance(exog, (type(None), pd.Series, pd.DataFrame, np.ndarray)):
        raise TypeError("`exog` must be a pandas Series, DataFrame, numpy array, or None.")

    y = np.asarray(y, dtype=float)
    if y.ndim == 2 and y.shape[1] == 1:
        y = y.ravel()
    elif y.ndim != 1:
        raise ValueError("`y` must be 1-dimensional.")

    exog_names_in_ = None
    if exog is not None:
        if isinstance(exog, pd.DataFrame):
            exog_names_in_ = list(exog.columns)
        exog = np.asarray(exog, dtype=float)
        if exog.ndim == 1:
            exog = exog.reshape(-1, 1)
        elif exog.ndim != 2:
            raise ValueError("`exog` must be 1- or 2-dimensional.")

        if len(exog) != len(y):
            raise ValueError(
                f"Length of `exog` ({len(exog)}) does not match length of `y` ({len(y)})."
            )

    ctx = (warnings.catch_warnings() if suppress_warnings else nullcontext())
    with ctx:
        if suppress_warnings:
            warnings.simplefilter("ignore")

        if self.is_auto:
            self.model_ = auto_arima(
                y                  = y,
                m                  = self.m,
                d                  = self.order[1] if self.order is not None else None,
                D                  = self.seasonal_order[1] if self.seasonal_order is not None else None,
                max_p              = self.max_p,
                max_q              = self.max_q,
                max_P              = self.max_P,
                max_Q              = self.max_Q,
                max_order          = self.max_order,
                max_d              = self.max_d,
                max_D              = self.max_D,
                start_p            = self.start_p,
                start_q            = self.start_q,
                start_P            = self.start_P,
                start_Q            = self.start_Q,
                stationary         = self.stationary,
                seasonal           = self.seasonal,
                ic                 = self.ic,
                stepwise           = self.stepwise,
                nmodels            = self.nmodels,
                trace              = self.trace,
                approximation      = self.approximation,
                method             = self.method,
                truncate           = self.truncate,
                xreg               = exog,
                test               = self.test,
                test_args          = self.test_kwargs,
                seasonal_test      = self.seasonal_test,
                seasonal_test_args = self.seasonal_test_kwargs,
                allowdrift         = self.allowdrift,
                allowmean          = self.allowmean,
                lambda_bc          = self.lambda_bc,
                biasadj            = self.biasadj,
                SSinit             = self.SSinit,
                kappa              = self.kappa
            )

            best_model_order_ = (
                self.model_['arma'][0],
                self.model_['arma'][5],
                self.model_['arma'][1]
            )
            best_seasonal_order_ = (
                self.model_['arma'][2],
                self.model_['arma'][6],
                self.model_['arma'][3]
            )
            self.best_params_ = {
                'order': best_model_order_,
                'seasonal_order': best_seasonal_order_,
                'm': self.m
            }

            # NOTE: Only needed to update `estimator_name_` when auto arima is used
            p, d, q = best_model_order_
            P, D, Q = best_seasonal_order_
            if P == 0 and D == 0 and Q == 0:
                self.estimator_name_ = f"AutoArima({p},{d},{q})"
            else:
                self.estimator_name_ = f"AutoArima({p},{d},{q})({P},{D},{Q})[{self.m}]"

        else:
            self.model_ = arima(
                x              = y,
                m              = self.m,
                order          = self.order,
                seasonal       = self.seasonal_order,
                xreg           = exog,
                include_mean   = self.include_mean,
                transform_pars = self.transform_pars,
                fixed          = None,
                init           = None,
                method         = self.method,
                n_cond         = self.n_cond,
                SSinit         = self.SSinit,
                optim_method   = self.optim_method,
                opt_options    = self.optim_kwargs,
                kappa          = self.kappa
            )

    self.y_train_             = self.model_['y']
    self.coef_                = self.model_['coef'].to_numpy().ravel()
    self.coef_names_          = list(self.model_['coef'].columns)
    self.sigma2_              = self.model_['sigma2']
    self.loglik_              = self.model_['loglik']
    self.aic_                 = self.model_['aic']
    self.bic_                 = self.model_['bic']
    self.arma_                = self.model_['arma']
    self.converged_           = self.model_['converged']
    self.fitted_values_       = self.model_['fitted']
    self.in_sample_residuals_ = self.model_['residuals']
    self.var_coef_            = self.model_['var_coef']
    self.n_exog_names_in_     = exog_names_in_
    self.n_exog_features_in_  = exog.shape[1] if exog is not None else 0
    self.n_features_in_       = 1
    self.is_memory_reduced    = False
    self.is_fitted            = True

    if exog_names_in_ is not None:
        n_exog = len(exog_names_in_)
        self.coef_names_ = self.coef_names_[:-n_exog] + exog_names_in_

    return self

predict

predict(steps, exog=None)

Generate mean forecasts steps ahead.

Parameters:

Name Type Description Default
steps int

Forecast horizon (must be > 0).

required
exog ndarray, Series or DataFrame of shape (steps, n_exog_features)

Exogenous regressors for the forecast period. Must have the same number of features as used during fitting.

None

Returns:

Name Type Description
predictions ndarray of shape (steps,)

Point forecasts for steps 1..steps.

Raises:

Type Description
ValueError

If model hasn't been fitted, steps <= 0, or exog shape is incorrect.

Source code in skforecast\stats\_arima.py
@check_is_fitted
def predict(
    self, 
    steps: int, 
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None
) -> np.ndarray:
    """
    Generate mean forecasts steps ahead.

    Parameters
    ----------
    steps : int
        Forecast horizon (must be > 0).
    exog : ndarray, Series or DataFrame of shape (steps, n_exog_features), default None
        Exogenous regressors for the forecast period. Must have the same 
        number of features as used during fitting.

    Returns
    -------
    predictions : ndarray of shape (steps,)
        Point forecasts for steps 1..steps.

    Raises
    ------
    ValueError
        If model hasn't been fitted, steps <= 0, or exog shape is incorrect.

    """

    if not isinstance(steps, (int, np.integer)) or steps <= 0:
        raise ValueError("`steps` must be a positive integer.")

    if exog is not None:
        exog = np.asarray(exog, dtype=float)
        if exog.ndim == 1:
            exog = exog.reshape(-1, 1)
        elif exog.ndim != 2:
            raise ValueError("`exog` must be 1- or 2-dimensional.")

        if len(exog) != steps:
            raise ValueError(
                f"Length of `exog` ({len(exog)}) must match `steps` ({steps})."
            )

        if exog.shape[1] != self.n_exog_features_in_:
            raise ValueError(
                f"Number of exogenous features ({exog.shape[1]}) does not match "
                f"the number used during fitting ({self.n_exog_features_in_})."
            )
    elif self.n_exog_features_in_ > 0:
        raise ValueError(
            f"Model was fitted with {self.n_exog_features_in_} exogenous features, "
            f"but `exog` was not provided for prediction."
        )

    if self.is_auto:
        predictions = forecast_arima(
            model   = self.model_,
            h       = steps,
            xreg    = exog
        )['mean']
    else:
        predictions = predict_arima(
            model   = self.model_,
            n_ahead = steps,
            newxreg = exog,
            se_fit  = False
        )['mean']

    return predictions

predict_interval

predict_interval(
    steps=1,
    level=None,
    alpha=None,
    as_frame=True,
    exog=None,
)

Forecast with prediction intervals.

Parameters:

Name Type Description Default
steps int

Forecast horizon.

1
level list or tuple of float

Confidence levels in percent (e.g., 80 for 80% intervals). If None and alpha is None, defaults to (80, 95). Cannot be specified together with alpha.

None
alpha float

The significance level for the prediction interval. If specified, the confidence interval will be (1 - alpha) * 100%. For example, alpha=0.05 gives 95% intervals. Cannot be specified together with level.

None
as_frame bool

If True, return a tidy DataFrame with columns 'mean', 'lower_<L>', 'upper_<L>' for each level L. If False, return a NumPy ndarray.

True
exog ndarray, Series or DataFrame of shape (steps, n_exog_features)

Exogenous regressors for the forecast period.

None

Returns:

Name Type Description
predictions numpy ndarray, pandas DataFrame

If as_frame=True, pandas DataFrame with columns 'mean', 'lower_<L>', 'upper_<L>' for each level L. If as_frame=False, numpy ndarray.

Raises:

Type Description
ValueError

If model hasn't been fitted, steps <= 0, or exog shape is incorrect.

Notes

Prediction intervals are computed using the standard errors from the Kalman filter and assuming normally distributed innovations. The intervals fully account for both parameter uncertainty (through the variance-covariance matrix) and forecast uncertainty.
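
A sketch of the two ways of requesting intervals, assuming model is a fitted Arima without exogenous regressors (as in the earlier sketches):

# Default levels (80, 95) returned as a tidy DataFrame
intervals = model.predict_interval(steps=12)
print(intervals.columns.tolist())   # expected: ['mean', 'lower_80', 'upper_80', 'lower_95', 'upper_95']

# A single 95% interval requested through `alpha`, returned as a ndarray
arr = model.predict_interval(steps=12, alpha=0.05, as_frame=False)
print(arr.shape)                    # expected: (12, 3) -> mean, lower, upper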

Source code in skforecast\stats\_arima.py
@check_is_fitted
def predict_interval(
    self,
    steps: int = 1,
    level: list[float] | tuple[float, ...] | None = None,
    alpha: float | None = None,
    as_frame: bool = True,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None
) -> np.ndarray | pd.DataFrame:
    """
    Forecast with prediction intervals.

    Parameters
    ----------
    steps : int, default 1
        Forecast horizon.
    level : list or tuple of float, default None
        Confidence levels in percent (e.g., 80 for 80% intervals).
        If None and alpha is None, defaults to (80, 95).
        Cannot be specified together with `alpha`.
    alpha : float, default None
        The significance level for the prediction interval. 
        If specified, the confidence interval will be (1 - alpha) * 100%.
        For example, alpha=0.05 gives 95% intervals.
        Cannot be specified together with `level`.
    as_frame : bool, default True
        If True, return a tidy DataFrame with columns 'mean', 'lower_<L>',
        'upper_<L>' for each level L. If False, return a NumPy ndarray.
    exog : ndarray, Series or DataFrame of shape (steps, n_exog_features), default None
        Exogenous regressors for the forecast period.

    Returns
    -------
    predictions : numpy ndarray, pandas DataFrame
        If as_frame=True, pandas DataFrame with columns 'mean', 'lower_<L>',
        'upper_<L>' for each level L. If as_frame=False, numpy ndarray.

    Raises
    ------
    ValueError
        If model hasn't been fitted, steps <= 0, or exog shape is incorrect.

    Notes
    -----
    Prediction intervals are computed using the standard errors from the 
    Kalman filter and assuming normally distributed innovations. The intervals 
    fully account for both parameter uncertainty (through the variance-covariance 
    matrix) and forecast uncertainty.

    """

    if not isinstance(steps, (int, np.integer)) or steps <= 0:
        raise ValueError("`steps` must be a positive integer.")

    if level is not None and alpha is not None:
        raise ValueError(
            "Cannot specify both `level` and `alpha`. Use one or the other."
        )

    if alpha is not None:
        if not 0 < alpha < 1:
            raise ValueError("`alpha` must be between 0 and 1.")
        level = [(1 - alpha) * 100]
    elif level is None:
        level = (80, 95)

    if isinstance(level, (int, float, np.number)):
        level = [level]
    else:
        level = list(level)

    if exog is not None:
        exog = np.asarray(exog, dtype=float)
        if exog.ndim == 1:
            exog = exog.reshape(-1, 1)
        elif exog.ndim != 2:
            raise ValueError("`exog` must be 1- or 2-dimensional.")

        if len(exog) != steps:
            raise ValueError(
                f"Length of `exog` ({len(exog)}) must match `steps` ({steps})."
            )

        if exog.shape[1] != self.n_exog_features_in_:
            raise ValueError(
                f"Number of exogenous features ({exog.shape[1]}) does not match "
                f"the number used during fitting ({self.n_exog_features_in_})."
            )
    elif self.n_exog_features_in_ > 0:
        raise ValueError(
            f"Model was fitted with {self.n_exog_features_in_} exogenous features, "
            f"but `exog` was not provided for prediction."
        )

    if self.is_auto:
        raw_preds = forecast_arima(
            model   = self.model_,
            h       = steps,
            xreg    = exog,
            level   = level
        )
    else:
        raw_preds = predict_arima(
            model   = self.model_,
            n_ahead = steps,
            newxreg = exog,
            se_fit  = True,
            level   = level
        )

    levels = raw_preds['level']
    n_levels = len(levels)
    predictions = np.empty((steps, 1 + 2 * n_levels), dtype=float)
    predictions[:, 0] = raw_preds['mean']
    predictions[:, 1::2] = raw_preds['lower']
    predictions[:, 2::2] = raw_preds['upper']

    if as_frame:
        col_names = ["mean"]
        for level in levels:
            level = int(level)
            col_names.append(f"lower_{level}")
            col_names.append(f"upper_{level}")

        predictions = pd.DataFrame(
            data    = predictions,
            columns = col_names,
            index   = pd.RangeIndex(1, steps + 1, name="step")
        )

    return predictions

get_residuals

get_residuals()

Get in-sample residuals (observed - fitted) from the ARIMA model.

Returns:

Name Type Description
residuals ndarray of shape (n_samples,)

In-sample residuals.

Raises:

Type Description
NotFittedError

If the model has not been fitted.

RuntimeError

If reduce_memory() has been called (residuals are no longer available).

Source code in skforecast\stats\_arima.py
@check_is_fitted
def get_residuals(self) -> np.ndarray:
    """
    Get in-sample residuals (observed - fitted) from the ARIMA model.

    Returns
    -------
    residuals : ndarray of shape (n_samples,)
        In-sample residuals.

    Raises
    ------
    NotFittedError
        If the model has not been fitted.
    RuntimeError
        If reduce_memory() has been called (residuals are no longer available).

    """

    check_memory_reduced(self, method_name='get_residuals')
    return self.in_sample_residuals_

get_fitted_values

get_fitted_values()

Get in-sample fitted values from the ARIMA model.

Returns:

Name Type Description
fitted ndarray of shape (n_samples,)

In-sample fitted values.

Raises:

Type Description
NotFittedError

If the model has not been fitted.

RuntimeError

If reduce_memory() has been called (fitted values are no longer available).

Source code in skforecast\stats\_arima.py
@check_is_fitted
def get_fitted_values(self) -> np.ndarray:
    """
    Get in-sample fitted values from the ARIMA model.

    Returns
    -------
    fitted : ndarray of shape (n_samples,)
        In-sample fitted values.

    Raises
    ------
    NotFittedError
        If the model has not been fitted.
    RuntimeError
        If reduce_memory() has been called (fitted values are no longer available).

    """

    check_memory_reduced(self, method_name='get_fitted_values')
    return self.fitted_values_

get_feature_importances

get_feature_importances()

Get feature importances for Arima model.

Source code in skforecast\stats\_arima.py
@check_is_fitted
def get_feature_importances(self) -> pd.DataFrame:
    """Get feature importances for Arima model."""
    importances = pd.DataFrame({
        'feature': self.coef_names_,
        'importance': self.coef_
    })
    return importances

get_score

get_score(y=None)

Compute R^2 score using in-sample fitted values.

Parameters:

Name Type Description Default
y ignored

Present for API compatibility with sklearn.

None

Returns:

Name Type Description
score float

Coefficient of determination (R^2).

Source code in skforecast\stats\_arima.py
@check_is_fitted
def get_score(self, y: None = None) -> float:
    """
    Compute R^2 score using in-sample fitted values.

    Parameters
    ----------
    y : ignored
        Present for API compatibility with sklearn.

    Returns
    -------
    score : float
        Coefficient of determination (R^2).

    """

    check_memory_reduced(self, method_name='get_score')

    y = self.y_train_
    fitted = self.fitted_values_

    # Handle NaN values if any
    mask = ~(np.isnan(y) | np.isnan(fitted))
    if mask.sum() < 2:
        return np.nan

    ss_res = np.sum((y[mask] - fitted[mask]) ** 2)
    ss_tot = np.sum((y[mask] - y[mask].mean()) ** 2) + np.finfo(float).eps

    return 1.0 - ss_res / ss_tot

get_info_criteria

get_info_criteria(criteria='aic')

Get the selected information criterion.

Parameters:

Name Type Description Default
criteria str

The information criterion to retrieve. Valid options are {'aic', 'bic'}.

'aic'

Returns:

Name Type Description
metric float

The value of the selected information criterion.

Source code in skforecast\stats\_arima.py
@check_is_fitted
def get_info_criteria(self, criteria: str = 'aic') -> float:
    """
    Get the selected information criterion.

    Parameters
    ----------
    criteria : str, default 'aic'
        The information criterion to retrieve. Valid options are 
        {'aic', 'bic'}.

    Returns
    -------
    metric : float
        The value of the selected information criterion.

    """

    if criteria not in ['aic', 'bic']:
        raise ValueError(
            f"Invalid value for `criteria`: '{criteria}'. "
            f"Valid options are 'aic' and 'bic'."
        )

    if criteria == 'aic':
        value = self.aic_
    elif criteria == 'bic':
        # NOTE: BIC may not be available. This can occur when the model did
        # not converge or when other estimation issues arise.
        value = self.bic_ if self.bic_ is not None else np.nan

    return value

get_params

get_params(deep=True)

Get parameters for this estimator.

Parameters:

Name Type Description Default
deep bool

If True, will return the parameters for this estimator and contained subobjects that are estimators.

True

Returns:

Name Type Description
params dict

Parameter names mapped to their values.

Source code in skforecast\stats\_arima.py
def get_params(self, deep: bool = True) -> dict:
    """
    Get parameters for this estimator.

    Parameters
    ----------
    deep : bool, default True
        If True, will return the parameters for this estimator and
        contained subobjects that are estimators.

    Returns
    -------
    params : dict
        Parameter names mapped to their values.
    """
    return {
        "order": self.order,
        "seasonal_order": self.seasonal_order,
        "m": self.m,
        "include_mean": self.include_mean,
        "transform_pars": self.transform_pars,
        "method": self.method,
        "n_cond": self.n_cond,
        "SSinit": self.SSinit,
        "optim_method": self.optim_method,
        "optim_kwargs": self.optim_kwargs,
        "kappa": self.kappa,
    }

_set_params

_set_params(**params)

Set the parameters of this estimator. Internal method without resetting the fitted state. This method is intended for internal use only, please use set_params() instead.

Parameters:

Name Type Description Default
**params dict

Estimator parameters.

{}

Returns:

Type Description
None
Source code in skforecast\stats\_arima.py
def _set_params(self, **params) -> None:
    """
    Set the parameters of this estimator. Internal method without resetting 
    the fitted state. This method is intended for internal use only, please 
    use `set_params()` instead.

    Parameters
    ----------
    **params : dict
        Estimator parameters.

    Returns
    -------
    None

    """

    for key, value in params.items():
        setattr(self, key, value)

    self.is_auto = self.order is None or self.seasonal_order is None
    if self.is_auto:
        estimator_name_ = "AutoArima()"
    else:
        p, d, q = self.order
        P, D, Q = self.seasonal_order
        if P == 0 and D == 0 and Q == 0:
            estimator_name_ = f"Arima({p},{d},{q})"
        else:
            estimator_name_ = f"Arima({p},{d},{q})({P},{D},{Q})[{self.m}]"

    self.estimator_name_ = estimator_name_

set_params

set_params(**params)

Set the parameters of this estimator and reset the fitted state.

This method resets the estimator to its unfitted state whenever parameters are changed, requiring the model to be refitted before making predictions.
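
A short sketch of the reset behaviour (parameter values are arbitrary; y is a series as in the earlier sketches):

model = Arima(order=(1, 0, 0))
model.fit(y)
print(model.is_fitted)   # True

model.set_params(order=(2, 1, 1), seasonal_order=(0, 1, 1), m=12)
print(model.is_fitted)   # False -> fit() must be called again before predicting
model.fit(y)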

Parameters:

Name Type Description Default
**params dict

Estimator parameters. Valid parameter keys are: 'order', 'seasonal_order', 'm', 'include_mean', 'transform_pars', 'method', 'n_cond', 'SSinit', 'optim_method', 'optim_kwargs', 'kappa'.

{}

Returns:

Name Type Description
self Arima

The estimator with updated parameters and reset state.

Raises:

Type Description
ValueError

If any parameter key is invalid.

Source code in skforecast\stats\_arima.py
def set_params(self, **params) -> "Arima":
    """
    Set the parameters of this estimator and reset the fitted state.

    This method resets the estimator to its unfitted state whenever parameters
    are changed, requiring the model to be refitted before making predictions.

    Parameters
    ----------
    **params : dict
        Estimator parameters. Valid parameter keys are: 'order', 'seasonal_order',
        'm', 'include_mean', 'transform_pars', 'method', 'n_cond', 'SSinit',
        'optim_method', 'optim_kwargs', 'kappa'.

    Returns
    -------
    self : Arima
        The estimator with updated parameters and reset state.

    Raises
    ------
    ValueError
        If any parameter key is invalid.

    """

    valid_params = {
        'order', 'seasonal_order', 'm', 'include_mean', 'transform_pars',
        'method', 'n_cond', 'SSinit', 'optim_method', 'optim_kwargs', 'kappa'
    }
    for key in params.keys():
        if key not in valid_params:
            raise ValueError(
                f"Invalid parameter '{key}'. Valid parameters are: {valid_params}"
            )

    self._set_params(**params)

    fitted_attrs = [
        'model_', 'y_train_', 'coef_', 'coef_names_', 'sigma2_', 'loglik_',
        'aic_', 'bic_', 'arma_', 'converged_', 'fitted_values_', 'in_sample_residuals_',
        'var_coef_', 'n_features_in_', 'n_exog_features_in_', 'n_exog_names_in_'
    ]
    for attr in fitted_attrs:
        setattr(self, attr, None)

    self.is_memory_reduced = False
    self.is_fitted = False

    return self

summary

summary()

Print a summary of the fitted ARIMA model. Includes model specification, coefficients, fit statistics, and residual diagnostics. If reduce_memory() has been called, summary information will be limited.

Source code in skforecast\stats\_arima.py
@check_is_fitted
def summary(self) -> None:
    """
    Print a summary of the fitted ARIMA model.
    Includes model specification, coefficients, fit statistics, and residual diagnostics.
    If reduce_memory() has been called, summary information will be limited.
    """

    print("ARIMA Model Summary")
    print("=" * 60)
    print(f"Model     : {self.estimator_name_}")
    print(f"Method    : {self.model_['method']}")
    print(f"Converged : {self.converged_}")
    print()

    print("Coefficients:")
    print("-" * 60)
    for i, name in enumerate(self.coef_names_):
        # Extract standard error from variance-covariance matrix
        if self.var_coef_ is not None and i < self.var_coef_.shape[0] and i < self.var_coef_.shape[1]:
            se = np.sqrt(self.var_coef_[i, i])
            t_stat = self.coef_[i] / se if se > 0 else np.nan
            print(f"  {name:15s}: {self.coef_[i]:10.4f}  (SE: {se:8.4f}, t: {t_stat:8.2f})")
        else:
            print(f"  {name:15s}: {self.coef_[i]:10.4f}")
    print()

    print("Model fit statistics:")
    print(f"  sigma^2:             {self.sigma2_:.6f}")
    print(f"  Log-likelihood:      {self.loglik_:.2f}")
    print(f"  AIC:                 {self.aic_:.2f}")
    if self.bic_ is not None:
        print(f"  BIC:                 {self.bic_:.2f}")
    else:
        print(f"  BIC:                 N/A")
    print()

    if not self.is_memory_reduced:
        print("Residual statistics:")
        print(f"  Mean:                {np.mean(self.in_sample_residuals_):.6f}")
        print(f"  Std Dev:             {np.std(self.in_sample_residuals_, ddof=1):.6f}")
        print(f"  MAE:                 {np.mean(np.abs(self.in_sample_residuals_)):.6f}")
        print(f"  RMSE:                {np.sqrt(np.mean(self.in_sample_residuals_**2)):.6f}")
        print()

        print("Time Series Summary Statistics:")
        print(f"Number of observations: {len(self.y_train_)}")
        print(f"  Mean:                 {np.mean(self.y_train_):.4f}")
        print(f"  Std Dev:              {np.std(self.y_train_, ddof=1):.4f}")
        print(f"  Min:                  {np.min(self.y_train_):.4f}")
        print(f"  25%:                  {np.percentile(self.y_train_, 25):.4f}")
        print(f"  Median:               {np.median(self.y_train_):.4f}")
        print(f"  75%:                  {np.percentile(self.y_train_, 75):.4f}")
        print(f"  Max:                  {np.max(self.y_train_):.4f}")

reduce_memory

reduce_memory()

Free memory by deleting large attributes after fitting.

This method removes fitted values, residuals, and other intermediate results that are not strictly necessary for prediction. After calling this method, certain diagnostic functions (like get_residuals(), get_fitted_values(), summary()) will no longer work, but prediction methods will continue to function.

Call this method only if you need to reduce memory usage and don't need access to diagnostic information.
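
A short sketch of the trade-off, assuming a fitted model as in the earlier sketches:

model = Arima(order=(1, 0, 0)).fit(y)   # fit() returns self, so chaining works
model.reduce_memory()                   # emits a UserWarning

model.predict(steps=5)                  # prediction still works
# model.get_residuals()                 # would now raise, as documented above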

Returns:

Name Type Description
self Arima

The estimator with reduced memory footprint.

Source code in skforecast\stats\_arima.py
@check_is_fitted
def reduce_memory(self) -> "Arima":
    """
    Free memory by deleting large attributes after fitting.

    This method removes fitted values, residuals, and other intermediate 
    results that are not strictly necessary for prediction. After calling 
    this method, certain diagnostic functions (like get_residuals(), 
    get_fitted_values(), summary()) will no longer work, but prediction 
    methods will continue to function.

    Call this method only if you need to reduce memory usage and don't 
    need access to diagnostic information.

    Returns
    -------
    self : Arima
        The estimator with reduced memory footprint.

    """

    attrs_to_delete = ['y_train_', 'fitted_values_', 'in_sample_residuals_']

    for attr in attrs_to_delete:
        if hasattr(self, attr):
            delattr(self, attr)

    self.is_memory_reduced = True

    warnings.warn(
        "Memory reduced. Diagnostic methods (get_residuals, get_fitted_values, "
        "summary, get_score) are no longer available. Prediction methods remain functional.",
        UserWarning
    )

    return self

skforecast.stats._sarimax.Sarimax

Sarimax(
    order=(1, 0, 0),
    seasonal_order=(0, 0, 0, 0),
    trend=None,
    measurement_error=False,
    time_varying_regression=False,
    mle_regression=True,
    simple_differencing=False,
    enforce_stationarity=True,
    enforce_invertibility=True,
    hamilton_representation=False,
    concentrate_scale=False,
    trend_offset=1,
    use_exact_diffuse=False,
    dates=None,
    freq=None,
    missing="none",
    validate_specification=True,
    method="lbfgs",
    maxiter=50,
    start_params=None,
    disp=False,
    sm_init_kwargs={},
    sm_fit_kwargs={},
    sm_predict_kwargs={},
)

Bases: BaseEstimator, RegressorMixin

A universal scikit-learn style wrapper for statsmodels SARIMAX.

This class wraps the statsmodels.tsa.statespace.sarimax.SARIMAX model [1]_ [2]_ to follow the scikit-learn style. The following docstring is based on the statsmodels documentation, and it is highly recommended to visit their site for full details.
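
A construction sketch using the module path shown above. Only the constructor is exercised here; it is assumed that the wrapper then exposes the same scikit-learn style fit/predict workflow as the Arima wrapper documented earlier on this page.

from skforecast.stats._sarimax import Sarimax  # module path as documented on this page

model = Sarimax(
    order=(1, 1, 1),
    seasonal_order=(1, 1, 1, 12),   # (P, D, Q, s): monthly seasonality
    maxiter=200,
    sm_fit_kwargs={},               # extra arguments forwarded to statsmodels SARIMAX.fit
)

# model.fit(y) / model.predict(steps) are then used as with Arima above (assumed)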

Parameters:

Name Type Description Default
order tuple

The (p,d,q) order of the model for the number of AR parameters, differences, and MA parameters.

  • d must be an integer indicating the integration order of the process.
  • p and q may either be integers indicating the AR and MA orders (so that all lags up to those orders are included) or else iterables giving specific AR and/or MA lags to include.
(1, 0, 0)
seasonal_order tuple

The (P,D,Q,s) order of the seasonal component of the model for the AR parameters, differences, MA parameters, and periodicity.

  • D must be an integer indicating the integration order of the process.
  • P and Q may either be integers indicating the AR and MA orders (so that all lags up to those orders are included) or else iterables giving specific AR and/or MA lags to include.
  • s is an integer giving the periodicity (number of periods in season), often it is 4 for quarterly data or 12 for monthly data.
(0, 0, 0, 0)
trend str

Parameter controlling the deterministic trend polynomial A(t).

  • 'c' indicates a constant (i.e. a degree zero component of the trend polynomial).
  • 't' indicates a linear trend with time.
  • 'ct' indicates both, 'c' and 't'.
  • Can also be specified as an iterable defining the non-zero polynomial exponents to include, in increasing order. For example, [1,1,0,1] denotes a + b*t + c*t^3.
None
measurement_error bool

Whether or not to assume the endogenous observations y were measured with error.

False
time_varying_regression bool

Used when explanatory variables, exog, are provided to select whether or not coefficients on the exogenous estimators are allowed to vary over time.

False
mle_regression bool

Whether or not to estimate the regression coefficients for the exogenous variables as part of maximum likelihood estimation or through the Kalman filter (i.e. recursive least squares). If time_varying_regression is True, this must be set to False.

True
simple_differencing bool

Whether or not to use partially conditional maximum likelihood estimation.

  • If True, differencing is performed prior to estimation, which discards the first s*D + d initial rows but results in a smaller state-space formulation.
  • If False, the full SARIMAX model is put in state-space form so that all data points can be used in estimation.
False
enforce_stationarity bool

Whether or not to transform the AR parameters to enforce stationarity in the autoregressive component of the model.

True
enforce_invertibility bool

Whether or not to transform the MA parameters to enforce invertibility in the moving average component of the model.

True
hamilton_representation bool

Whether or not to use the Hamilton representation of an ARMA process (if True) or the Harvey representation (if False).

False
concentrate_scale bool

Whether or not to concentrate the scale (variance of the error term) out of the likelihood. This reduces the number of parameters estimated by maximum likelihood by one, but standard errors will then not be available for the scale parameter.

False
trend_offset int

The offset at which to start time trend values. Default is 1, so that if trend='t' the trend is equal to 1, 2, ..., nobs. Typically only set when the model is created by extending a previous dataset.

1
use_exact_diffuse bool

Whether or not to use exact diffuse initialization for non-stationary states. Default is False (in which case approximate diffuse initialization is used).

False
method str

The method determines which solver from scipy.optimize is used, and it can be chosen from among the following strings:

  • 'newton' for Newton-Raphson
  • 'nm' for Nelder-Mead
  • 'bfgs' for Broyden-Fletcher-Goldfarb-Shanno (BFGS)
  • 'lbfgs' for limited-memory BFGS with optional box constraints
  • 'powell' for modified Powell's method
  • 'cg' for conjugate gradient
  • 'ncg' for Newton-conjugate gradient
  • 'basinhopping' for global basin-hopping solver
'lbfgs'
maxiter int

The maximum number of iterations to perform.

50
start_params numpy ndarray

Initial guess of the solution for the loglikelihood maximization. If None, the default is given by estimator.start_params.

None
disp bool

Set to True to print convergence messages.

False
sm_init_kwargs dict

Additional keyword arguments to pass to the statsmodels SARIMAX model when it is initialized.

{}
sm_fit_kwargs dict

Additional keyword arguments to pass to the fit method of the statsmodels SARIMAX model. The statsmodels SARIMAX.fit parameters method, maxiter, start_params and disp have been moved to the initialization of this model and take priority over those provided by the user via sm_fit_kwargs.

{}
sm_predict_kwargs dict

Additional keyword arguments to pass to the get_forecast method of the statsmodels SARIMAXResults object.

{}
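
As a quick illustration of how these arguments map onto a model specification, the sketch below builds a monthly SARIMA(1,1,1)(1,1,1)[12] configuration with a constant term. The configuration and the import path are assumptions for the example, not a recommendation from the library.

from skforecast.stats import Sarimax  # assumed import path

model = Sarimax(
    order          = (1, 1, 1),      # non-seasonal (p, d, q)
    seasonal_order = (1, 1, 1, 12),  # seasonal (P, D, Q, s) for monthly data
    trend          = 'c',            # include a constant
    maxiter        = 200             # forwarded to the statsmodels fit call
)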

Attributes:

Name Type Description
order tuple

The (p,d,q) order of the model for the number of AR parameters, differences, and MA parameters.

seasonal_order tuple

The (P,D,Q,s) order of the seasonal component of the model for the AR parameters, differences, MA parameters, and periodicity.

trend str

Deterministic trend polynomial A(t).

measurement_error bool

Whether or not to assume the endogenous observations y were measured with error.

time_varying_regression bool

Used when explanatory variables, exog, are provided to select whether or not coefficients on the exogenous estimators are allowed to vary over time.

mle_regression bool

Whether or not to estimate the regression coefficients for the exogenous variables as part of maximum likelihood estimation or through the Kalman filter (i.e. recursive least squares). If time_varying_regression is True, this must be set to False.

simple_differencing bool

Whether or not to use partially conditional maximum likelihood estimation.

enforce_stationarity bool

Whether or not to transform the AR parameters to enforce stationarity in the autoregressive component of the model.

enforce_invertibility bool

Whether or not to transform the MA parameters to enforce invertibility in the moving average component of the model.

hamilton_representation bool

Whether or not to use the Hamilton representation of an ARMA process (if True) or the Harvey representation (if False).

concentrate_scale bool

Whether or not to concentrate the scale (variance of the error term) out of the likelihood. This reduces the number of parameters estimated by maximum likelihood by one, but standard errors will then not be available for the scale parameter.

trend_offset int

The offset at which to start time trend values.

use_exact_diffuse bool

Whether or not to use exact diffuse initialization for non-stationary states.

method str

The method determines which solver from scipy.optimize is used.

maxiter int

The maximum number of iterations to perform.

start_params numpy ndarray

Initial guess of the solution for the loglikelihood maximization.

disp bool

Set to True to print convergence messages.

sm_init_kwargs dict

Additional keyword arguments to pass to the statsmodels SARIMAX model when it is initialized.

sm_fit_kwargs dict

Additional keyword arguments to pass to the fit method of the statsmodels SARIMAX model.

sm_predict_kwargs dict

Additional keyword arguments to pass to the get_forecast method of the statsmodels SARIMAXResults object.

_sarimax_params dict

Parameters of this model that can be set with the set_params method.

output_type str

Format of the object returned by the predict method. This is set automatically according to the type of y used in the fit method to train the model, 'numpy' or 'pandas'.

sarimax object

The statsmodels.tsa.statespace.sarimax.SARIMAX object created.

is_fitted bool

Tag to identify if the estimator has been fitted (trained).

sarimax_res object

The resulting statsmodels.tsa.statespace.sarimax.SARIMAXResults object created by statsmodels after fitting the SARIMAX model.

training_index pandas Index

Index of the training series, as long as it is a pandas Series or DataFrame.

estimator_name_ str

String identifier of the fitted model configuration (e.g., "Sarimax(1,1,1)(0,0,0)[1]"). This is updated after fitting to reflect the selected model.

References

.. [1] Statsmodels SARIMAX API Reference. https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html

.. [2] Statsmodels SARIMAXResults API Reference. https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.sarimax.SARIMAXResults.html

Methods:

Name Description
fit

Fit the model to the data.

predict

Forecast future values and, if desired, their confidence intervals.

append

Recreate the results object with new data appended to the original data.

apply

Apply the fitted parameters to new data unrelated to the original data.

extend

Recreate the results object for new data that extends the original data.

set_params

Set new values to the parameters of the estimator.

get_params

Get the non-trainable parameters of the estimator.

params

Get the parameters of the fitted model.

summary

Get a summary of the SARIMAXResults object.

get_info_criteria

Get the selected information criteria.

get_feature_importances

Get feature importances for SARIMAX statsmodels model.

Source code in skforecast\stats\_sarimax.py
def __init__(
    self,
    order: tuple = (1, 0, 0),
    seasonal_order: tuple = (0, 0, 0, 0),
    trend: str = None,
    measurement_error: bool = False,
    time_varying_regression: bool = False,
    mle_regression: bool = True,
    simple_differencing: bool = False,
    enforce_stationarity: bool = True,
    enforce_invertibility: bool = True,
    hamilton_representation: bool = False,
    concentrate_scale: bool = False,
    trend_offset: int = 1,
    use_exact_diffuse: bool = False,
    dates = None,
    freq = None,
    missing = 'none',
    validate_specification: bool = True,
    method: str = 'lbfgs',
    maxiter: int = 50,
    start_params: np.ndarray = None,
    disp: bool = False,
    sm_init_kwargs: dict[str, object] = {},
    sm_fit_kwargs: dict[str, object] = {},
    sm_predict_kwargs: dict[str, object] = {}
) -> None:

    self.order                   = order
    self.seasonal_order          = seasonal_order
    self.trend                   = trend
    self.measurement_error       = measurement_error
    self.time_varying_regression = time_varying_regression
    self.mle_regression          = mle_regression
    self.simple_differencing     = simple_differencing
    self.enforce_stationarity    = enforce_stationarity
    self.enforce_invertibility   = enforce_invertibility
    self.hamilton_representation = hamilton_representation
    self.concentrate_scale       = concentrate_scale
    self.trend_offset            = trend_offset
    self.use_exact_diffuse       = use_exact_diffuse
    self.dates                   = dates
    self.freq                    = freq
    self.missing                 = missing
    self.validate_specification  = validate_specification
    self.method                  = method
    self.maxiter                 = maxiter
    self.start_params            = start_params
    self.disp                    = disp

    # Create the dictionaries with the additional statsmodels parameters to be  
    # used during the init, fit and predict methods. Note that the statsmodels 
    # SARIMAX.fit parameters `method`, `maxiter`, `start_params` and `disp` 
    # have been moved to the initialization of this model and will have 
    # priority over those provided by the user via `sm_fit_kwargs`.
    self.sm_init_kwargs    = sm_init_kwargs
    self.sm_fit_kwargs     = sm_fit_kwargs
    self.sm_predict_kwargs = sm_predict_kwargs

    # Params that can be set with the `set_params` method
    _, _, _, _sarimax_params = inspect.getargvalues(inspect.currentframe())
    self._sarimax_params = {
        k: v for k, v in _sarimax_params.items() 
        if k not in ['self', '_', '_sarimax_params']
    }

    self._consolidate_kwargs()

    # Create Results Attributes 
    self.output_type    = None
    self.sarimax        = None
    self.is_fitted      = False
    self.sarimax_res    = None
    self.training_index = None

    p, d, q = self.order
    P, D, Q, m = self.seasonal_order
    self.estimator_name_ = f"Sarimax({p},{d},{q})({P},{D},{Q})[{m}]"

order instance-attribute

order = order

seasonal_order instance-attribute

seasonal_order = seasonal_order

trend instance-attribute

trend = trend

measurement_error instance-attribute

measurement_error = measurement_error

time_varying_regression instance-attribute

time_varying_regression = time_varying_regression

mle_regression instance-attribute

mle_regression = mle_regression

simple_differencing instance-attribute

simple_differencing = simple_differencing

enforce_stationarity instance-attribute

enforce_stationarity = enforce_stationarity

enforce_invertibility instance-attribute

enforce_invertibility = enforce_invertibility

hamilton_representation instance-attribute

hamilton_representation = hamilton_representation

concentrate_scale instance-attribute

concentrate_scale = concentrate_scale

trend_offset instance-attribute

trend_offset = trend_offset

use_exact_diffuse instance-attribute

use_exact_diffuse = use_exact_diffuse

dates instance-attribute

dates = dates

freq instance-attribute

freq = freq

missing instance-attribute

missing = missing

validate_specification instance-attribute

validate_specification = validate_specification

method instance-attribute

method = method

maxiter instance-attribute

maxiter = maxiter

start_params instance-attribute

start_params = start_params

disp instance-attribute

disp = disp

sm_init_kwargs instance-attribute

sm_init_kwargs = sm_init_kwargs

sm_fit_kwargs instance-attribute

sm_fit_kwargs = sm_fit_kwargs

sm_predict_kwargs instance-attribute

sm_predict_kwargs = sm_predict_kwargs

_sarimax_params instance-attribute

_sarimax_params = {
    k: v
    for k, v in _sarimax_params.items()
    if k not in ["self", "_", "_sarimax_params"]
}

output_type instance-attribute

output_type = None

sarimax instance-attribute

sarimax = None

is_fitted instance-attribute

is_fitted = False

sarimax_res instance-attribute

sarimax_res = None

training_index instance-attribute

training_index = None

estimator_name_ instance-attribute

estimator_name_ = f'Sarimax({p},{d},{q})({P},{D},{Q})[{m}]'

_consolidate_kwargs

_consolidate_kwargs()

Create the dictionaries to be used during the init, fit, and predict methods. Note that the parameters in this model's initialization take precedence over those provided by the user via the statsmodels kwargs dicts.

Parameters:

Name Type Description Default
self
required

Returns:

Type Description
None
Source code in skforecast\stats\_sarimax.py
def _consolidate_kwargs(
    self
) -> None:
    """
    Create the dictionaries to be used during the init, fit, and predict methods.
    Note that the parameters in this model's initialization take precedence 
    over those provided by the user via the statsmodels kwargs dicts.

    Parameters
    ----------
    self

    Returns
    -------
    None

    """

    # statsmodels.tsa.statespace.SARIMAX parameters
    _init_kwargs = self.sm_init_kwargs.copy()
    _init_kwargs.update({
       'order': self.order,
       'seasonal_order': self.seasonal_order,
       'trend': self.trend,
       'measurement_error': self.measurement_error,
       'time_varying_regression': self.time_varying_regression,
       'mle_regression': self.mle_regression,
       'simple_differencing': self.simple_differencing,
       'enforce_stationarity': self.enforce_stationarity,
       'enforce_invertibility': self.enforce_invertibility,
       'hamilton_representation': self.hamilton_representation,
       'concentrate_scale': self.concentrate_scale,
       'trend_offset': self.trend_offset,
       'use_exact_diffuse': self.use_exact_diffuse,
       'dates': self.dates,
       'freq': self.freq,
       'missing': self.missing,
       'validate_specification': self.validate_specification
    })
    self._init_kwargs = _init_kwargs

    # statsmodels.tsa.statespace.SARIMAX.fit parameters
    _fit_kwargs = self.sm_fit_kwargs.copy()
    _fit_kwargs.update({
       'method': self.method,
       'maxiter': self.maxiter,
       'start_params': self.start_params,
       'disp': self.disp,
    })        
    self._fit_kwargs = _fit_kwargs

    # statsmodels.tsa.statespace.SARIMAXResults.get_forecast parameters
    self._predict_kwargs = self.sm_predict_kwargs.copy()

_create_sarimax

_create_sarimax(endog, exog=None)

A helper method to create a new statsmodels SARIMAX model.

Additional keyword arguments to pass to the statsmodels SARIMAX model when it is initialized can be added with the sm_init_kwargs argument when initializing the model.

Parameters:

Name Type Description Default
endog numpy ndarray, pandas Series, pandas DataFrame

The endogenous variable.

required
exog numpy ndarray, pandas Series, pandas DataFrame

The exogenous variables.

None

Returns:

Type Description
None
Source code in skforecast\stats\_sarimax.py
def _create_sarimax(
    self,
    endog: np.ndarray | pd.Series | pd.DataFrame,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None
) -> None:
    """
    A helper method to create a new statsmodels SARIMAX model.

    Additional keyword arguments to pass to the statsmodels SARIMAX model 
    when it is initialized can be added with the `sm_init_kwargs` argument 
    when initializing the model.

    Parameters
    ----------
    endog : numpy ndarray, pandas Series, pandas DataFrame
        The endogenous variable.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default None
        The exogenous variables.

    Returns
    -------
    None

    """

    self.sarimax = SARIMAX(endog=endog, exog=exog, **self._init_kwargs)

fit

fit(y, exog=None)

Fit the model to the data.

Additional keyword arguments to pass to the fit method of the statsmodels SARIMAX model can be added with the sm_fit_kwargs argument when initializing the model.

Parameters:

Name Type Description Default
y numpy ndarray, pandas Series, pandas DataFrame

Training time series.

required
exog numpy ndarray, pandas Series, pandas DataFrame

Exogenous variable/s included as predictor/s. Must have the same number of observations as y and their indexes must be aligned so that y[i] is regressed on exog[i].

None

Returns:

Type Description
None
Source code in skforecast\stats\_sarimax.py
def fit(
    self,
    y: np.ndarray | pd.Series | pd.DataFrame,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None
) -> None:
    """
    Fit the model to the data.

    Additional keyword arguments to pass to the `fit` method of the
    statsmodels SARIMAX model can be added with the `sm_fit_kwargs` argument 
    when initializing the model.

    Parameters
    ----------
    y : numpy ndarray, pandas Series, pandas DataFrame
        Training time series.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default None
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `y` and their indexes must be aligned so
        that y[i] is regressed on exog[i].

    Returns
    -------
    None

    """

    # Reset values in case the model has already been fitted.
    self.output_type    = None
    self.sarimax_res    = None
    self.is_fitted      = False
    self.training_index = None

    self.output_type = 'numpy' if isinstance(y, np.ndarray) else 'pandas'

    self._create_sarimax(endog=y, exog=exog)
    self.sarimax_res = self.sarimax.fit(**self._fit_kwargs)
    self.is_fitted = True

    if self.output_type == 'pandas':
        self.training_index = y.index

predict

predict(
    steps, exog=None, return_conf_int=False, alpha=0.05
)

Forecast future values and, if desired, their confidence intervals.

Generate predictions (forecasts) n steps into the future, optionally with confidence intervals. Note that if exogenous variables were used when fitting the model, they must also be provided to predict; otherwise, the call will fail.

Additional keyword arguments to pass to the get_forecast method of the statsmodels SARIMAX model can be added with the sm_predict_kwargs argument when initializing the model.

Parameters:

Name Type Description Default
steps int

Number of steps to predict.

required
exog numpy ndarray, pandas Series, pandas DataFrame

Value of the exogenous variable/s for the next steps. The number of observations needed is the number of steps to predict.

None
return_conf_int bool

Whether to get the confidence intervals of the forecasts.

False
alpha float

The confidence intervals for the forecasts are (1 - alpha) %.

0.05

Returns:

Name Type Description
predictions numpy ndarray, pandas DataFrame

Values predicted by the forecaster and their estimated interval. The output type is the same as the type of y used in the fit method.

  • pred: predictions.
  • lower_bound: lower bound of the interval. (if return_conf_int)
  • upper_bound: upper bound of the interval. (if return_conf_int)
Source code in skforecast\stats\_sarimax.py
@check_is_fitted
def predict(
    self,
    steps: int,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None, 
    return_conf_int: bool = False,
    alpha: float = 0.05
) -> np.ndarray | pd.DataFrame:
    """
    Forecast future values and, if desired, their confidence intervals.

    Generate predictions (forecasts) n steps in the future with confidence
    intervals. Note that if exogenous variables were used when fitting the 
    model, they must also be provided to `predict`; otherwise, the call will fail.

    Additional keyword arguments to pass to the `get_forecast` method of the
    statsmodels SARIMAX model can be added with the `sm_predict_kwargs` argument 
    when initializing the model.

    Parameters
    ----------
    steps : int
        Number of steps to predict. 
    exog : numpy ndarray, pandas Series, pandas DataFrame, default None
        Value of the exogenous variable/s for the next steps. The number of 
        observations needed is the number of steps to predict. 
    return_conf_int : bool, default False
        Whether to get the confidence intervals of the forecasts.
    alpha : float, default 0.05
        The confidence intervals for the forecasts are (1 - alpha) %.

    Returns
    -------
    predictions : numpy ndarray, pandas DataFrame
        Values predicted by the forecaster and their estimated interval. The 
        output type is the same as the type of `y` used in the fit method.

        - pred: predictions.
        - lower_bound: lower bound of the interval. (if `return_conf_int`)
        - upper_bound: upper bound of the interval. (if `return_conf_int`)

    """

    # This is done because statsmodels doesn't allow `exog` length greater than
    # the number of steps
    if exog is not None and len(exog) > steps:
        warnings.warn(
            f"When predicting using exogenous variables, the `exog` parameter "
            f"must have the same length as the number of predicted steps. Since "
            f"len(exog) > steps, only the first {steps} observations are used."
        )
        exog = exog[:steps]

    predictions = self.sarimax_res.get_forecast(
                      steps = steps,
                      exog  = exog,
                      **self._predict_kwargs
                  )

    if not return_conf_int:
        predictions = predictions.predicted_mean
        if self.output_type == 'pandas':
            predictions = predictions.rename("pred").to_frame()
    else:
        if self.output_type == 'numpy':
            predictions = np.column_stack(
                              [predictions.predicted_mean,
                               predictions.conf_int(alpha=alpha)]
                          )
        else:
            predictions = pd.concat((
                              predictions.predicted_mean,
                              predictions.conf_int(alpha=alpha)),
                              axis = 1
                          )
            predictions.columns = ['pred', 'lower_bound', 'upper_bound']

    return predictions
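
Continuing the instantiation sketch from the Parameters section, a hedged example of fitting and forecasting with confidence intervals; the series is synthetic and purely illustrative.

import numpy as np
import pandas as pd

y = pd.Series(
    np.random.normal(loc=100, scale=5, size=120),
    index=pd.date_range('2015-01-01', periods=120, freq='MS')
)

model.fit(y=y)
predictions = model.predict(steps=12, return_conf_int=True, alpha=0.05)
# Because `y` is a pandas Series, `predictions` is a pandas DataFrame with
# columns 'pred', 'lower_bound' and 'upper_bound'.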

append

append(
    y,
    exog=None,
    refit=False,
    copy_initialization=False,
    **kwargs
)

Recreate the results object with new data appended to the original data.

Creates a new result object applied to a dataset that is created by appending new data to the end of the model's original data [1]_. The new results can then be used for analysis or forecasting.

Parameters:

Name Type Description Default
y numpy ndarray, pandas Series, pandas DataFrame

New observations from the modeled time-series process.

required
exog numpy ndarray, pandas Series, pandas DataFrame

New observations of exogenous estimators, if applicable. Must have the same number of observations as y and their indexes must be aligned so that y[i] is regressed on exog[i].

None
refit bool

Whether to re-fit the parameters, based on the combined dataset.

False
copy_initialization bool

Whether or not to copy the initialization from the current results set to the new model.

False
**kwargs

Keyword arguments may be used to modify model specification arguments when creating the new model object.

{}

Returns:

Type Description
None
Notes

The y and exog arguments to this method must be formatted in the same way (e.g. Pandas Series versus Numpy array) as were the y and exog arrays passed to the original model.

The y argument to this method should consist of new observations that occurred directly after the last element of the model's original y. For any other kind of dataset, see the apply method.

This method will apply filtering to all of the original data as well as to the new data. To apply filtering only to the new data (which can be much faster if the original dataset is large), see the extend method.

References

.. [1] Statsmodels MLEResults append API Reference. https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.append.html#statsmodels.tsa.statespace.mlemodel.MLEResults.append

Source code in skforecast\stats\_sarimax.py
@check_is_fitted
def append(
    self,
    y: np.ndarray | pd.Series | pd.DataFrame,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None,
    refit: bool = False,
    copy_initialization: bool = False,
    **kwargs
) -> None:
    """
    Recreate the results object with new data appended to the original data.

    Creates a new result object applied to a dataset that is created by 
    appending new data to the end of the model's original data [1]_. The new 
    results can then be used for analysis or forecasting.

    Parameters
    ----------
    y : numpy ndarray, pandas Series, pandas DataFrame
        New observations from the modeled time-series process.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default None
        New observations of exogenous estimators, if applicable. Must have 
        the same number of observations as `y` and their indexes must be 
        aligned so that y[i] is regressed on exog[i].
    refit : bool, default False
        Whether to re-fit the parameters, based on the combined dataset.
    copy_initialization : bool, default False
        Whether or not to copy the initialization from the current results 
        set to the new model. 
    **kwargs
        Keyword arguments may be used to modify model specification arguments 
        when creating the new model object.

    Returns
    -------
    None

    Notes
    -----
    The `y` and `exog` arguments to this method must be formatted in the same 
    way (e.g. Pandas Series versus Numpy array) as were the `y` and `exog` 
    arrays passed to the original model.

    The `y` argument to this method should consist of new observations that 
    occurred directly after the last element of the model's original `y`. For any other kind of 
    dataset, see the apply method.

    This method will apply filtering to all of the original data as well as 
    to the new data. To apply filtering only to the new data (which can be 
    much faster if the original dataset is large), see the extend method.

    References
    ----------
    .. [1] Statsmodels MLEResults append API Reference.
           https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.append.html#statsmodels.tsa.statespace.mlemodel.MLEResults.append

    """

    fit_kwargs = self._fit_kwargs if refit else None

    self.sarimax_res = self.sarimax_res.append(
                           endog               = y,
                           exog                = exog,
                           refit               = refit,
                           copy_initialization = copy_initialization,
                           fit_kwargs          = fit_kwargs,
                           **kwargs
                       )

apply

apply(
    y,
    exog=None,
    refit=False,
    copy_initialization=False,
    **kwargs
)

Apply the fitted parameters to new data unrelated to the original data.

Creates a new result object using the current fitted parameters, applied to a completely new dataset that is assumed to be unrelated to the model's original data [1]_. The new results can then be used for analysis or forecasting.

Parameters:

Name Type Description Default
y numpy ndarray, pandas Series, pandas DataFrame

New observations from the modeled time-series process.

required
exog numpy ndarray, pandas Series, pandas DataFrame

New observations of exogenous estimators, if applicable. Must have the same number of observations as y and their indexes must be aligned so that y[i] is regressed on exog[i].

None
refit bool

Whether to re-fit the parameters, using the new dataset.

False
copy_initialization bool

Whether or not to copy the initialization from the current results set to the new model.

False
**kwargs

Keyword arguments may be used to modify model specification arguments when creating the new model object.

{}

Returns:

Type Description
None
Notes

The y argument to this method should consist of new observations that are not necessarily related to the original model's y dataset. For observations that continue that original dataset by following directly after its last element, see the append and extend methods.

References

.. [1] Statsmodels MLEResults apply API Reference. https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.apply.html#statsmodels.tsa.statespace.mlemodel.MLEResults.apply

Source code in skforecast\stats\_sarimax.py
@check_is_fitted
def apply(
    self,
    y: np.ndarray | pd.Series | pd.DataFrame,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None,
    refit: bool = False,
    copy_initialization: bool = False,
    **kwargs
) -> None:
    """
    Apply the fitted parameters to new data unrelated to the original data.

    Creates a new result object using the current fitted parameters, applied 
    to a completely new dataset that is assumed to be unrelated to the model's
    original data [1]_. The new results can then be used for analysis or forecasting.

    Parameters
    ----------
    y : numpy ndarray, pandas Series, pandas DataFrame
        New observations from the modeled time-series process.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default None
        New observations of exogenous estimators, if applicable. Must have 
        the same number of observations as `y` and their indexes must be 
        aligned so that y[i] is regressed on exog[i].
    refit : bool, default False
        Whether to re-fit the parameters, using the new dataset.
    copy_initialization : bool, default False
        Whether or not to copy the initialization from the current results 
        set to the new model. 
    **kwargs
        Keyword arguments may be used to modify model specification arguments 
        when creating the new model object.

    Returns
    -------
    None

    Notes
    -----
    The `y` argument to this method should consist of new observations that 
    are not necessarily related to the original model's `y` dataset. For 
    observations that continue that original dataset by following directly after 
    its last element, see the append and extend methods.

    References
    ----------
    .. [1] Statsmodels MLEResults apply API Reference.
           https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.apply.html#statsmodels.tsa.statespace.mlemodel.MLEResults.apply

    """

    fit_kwargs = self._fit_kwargs if refit else None

    self.sarimax_res = self.sarimax_res.apply(
                           endog               = y,
                           exog                = exog,
                           refit               = refit,
                           copy_initialization = copy_initialization,
                           fit_kwargs          = fit_kwargs,
                           **kwargs
                       )

extend

extend(y, exog=None, **kwargs)

Recreate the results object for new data that extends the original data.

Creates a new result object applied to a new dataset that is assumed to follow directly from the end of the model's original data [1]_. The new results can then be used for analysis or forecasting.

Parameters:

Name Type Description Default
y numpy ndarray, pandas Series, pandas DataFrame

New observations from the modeled time-series process.

required
exog numpy ndarray, pandas Series, pandas DataFrame

New observations of exogenous estimators, if applicable. Must have the same number of observations as y and their indexes must be aligned so that y[i] is regressed on exog[i].

None
**kwargs

Keyword arguments may be used to modify model specification arguments when creating the new model object.

{}

Returns:

Type Description
None
Notes

The y argument to this method should consist of new observations that occurred directly after the last element of the model's original y array. For any other kind of dataset, see the apply method.

This method will apply filtering only to the new data provided by the y argument, which can be much faster than re-filtering the entire dataset. However, the returned results object will only have results for the new data. To retrieve results for both the new data and the original data, see the append method.

References

.. [1] Statsmodels MLEResults extend API Reference. https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.extend.html#statsmodels.tsa.statespace.mlemodel.MLEResults.extend

Source code in skforecast\stats\_sarimax.py
@check_is_fitted
def extend(
    self,
    y: np.ndarray | pd.Series | pd.DataFrame,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None,
    **kwargs
) -> None:
    """
    Recreate the results object for new data that extends the original data.

    Creates a new result object applied to a new dataset that is assumed to 
    follow directly from the end of the model's original data [1]_. The new 
    results can then be used for analysis or forecasting.

    Parameters
    ----------
    y : numpy ndarray, pandas Series, pandas DataFrame
        New observations from the modeled time-series process.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default None
        New observations of exogenous estimators, if applicable. Must have 
        the same number of observations as `y` and their indexes must be 
        aligned so that y[i] is regressed on exog[i].
    **kwargs
        Keyword arguments may be used to modify model specification arguments 
        when creating the new model object.

    Returns
    -------
    None

    Notes
    -----
    The `y` argument to this method should consist of new observations that 
    occurred directly after the last element of the model's original `y` 
    array. For any other kind of dataset, see the apply method.

    This method will apply filtering only to the new data provided by the `y` 
    argument, which can be much faster than re-filtering the entire dataset. 
    However, the returned results object will only have results for the new 
    data. To retrieve results for both the new data and the original data, 
    see the append method.

    References
    ----------
    .. [1] Statsmodels MLEResults extend API Reference.
           https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.extend.html#statsmodels.tsa.statespace.mlemodel.MLEResults.extend

    """

    self.sarimax_res = self.sarimax_res.extend(
                           endog = y,
                           exog  = exog,
                           **kwargs
                       )
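
To illustrate how append, apply and extend differ, a short sketch that continues the example above. The new observations are assumed to follow directly after the training series, and other_series is a hypothetical unrelated series.

y_new = pd.Series(
    np.random.normal(loc=100, scale=5, size=12),
    index=pd.date_range('2025-01-01', periods=12, freq='MS')  # continues the training index
)

model.extend(y=y_new)            # filters only the new data; results cover the new data only
# model.append(y=y_new)          # results cover original + new data; use refit=True to re-estimate
# model.apply(y=other_series)    # reuse the fitted parameters on an unrelated series

predictions = model.predict(steps=6)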

set_params

set_params(**params)

Set new values to the parameters of the estimator.

Parameters:

Name Type Description Default
params dict

Parameters values.

{}

Returns:

Type Description
None
Source code in skforecast\stats\_sarimax.py
def set_params(
    self, 
    **params: dict[str, object]
) -> None:
    """
    Set new values to the parameters of the estimator.

    Parameters
    ----------
    params : dict
        Parameters values.

    Returns
    -------
    None

    """

    params = {k: v for k, v in params.items() if k in self._sarimax_params}
    for key, value in params.items():
        setattr(self, key, value)
        self._sarimax_params[key] = value

    self._consolidate_kwargs()

    # Reset values in case the model has already been fitted.
    self.output_type    = None
    self.sarimax_res    = None
    self.is_fitted      = False
    self.training_index = None

get_params

get_params(deep=True)

Get the non-trainable parameters of the estimator. This method is different from the params method, which returns the parameters of the fitted model.

Parameters:

Name Type Description Default
deep bool

If True, will return the parameters for this estimator and contained subobjects that are estimators.

True

Returns:

Name Type Description
params dict

Parameters of the estimator.

Source code in skforecast\stats\_sarimax.py
def get_params(
    self, 
    deep: bool = True
) -> dict[str, object]:
    """
    Get the non trainable parameters of the estimator. This method
    is different from the `params` method, which returns the parameters
    of the fitted model.

    Parameters
    ----------
    deep : bool, default True
        If `True`, will return the parameters for this estimator and 
        contained subobjects that are estimators.

    Returns
    -------
    params : dict
        Parameters of the estimator.

    """

    return self._sarimax_params.copy()
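
A brief sketch, continuing the example above, of how get_params and set_params interact with the fitted state.

model.get_params()['order']      # (1, 1, 1), the value given at initialization
model.set_params(order=(2, 1, 0))
model.is_fitted                  # False: set_params resets the fitted state
model.fit(y=y)                   # refit with the new specification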

params

params()

Get the parameters of the model. The order of variables is the trend coefficients, the k_exog exogenous coefficients, the k_ar AR coefficients, and finally the k_ma MA coefficients.

Returns:

Name Type Description
params numpy ndarray, pandas Series

The parameters of the model.

Source code in skforecast\stats\_sarimax.py
@check_is_fitted
def params(
    self
) -> np.ndarray | pd.Series:
    """
    Get the parameters of the model. The order of variables is the trend
    coefficients, the `k_exog` exogenous coefficients, the `k_ar` AR 
    coefficients, and finally the `k_ma` MA coefficients.

    Returns
    -------
    params : numpy ndarray, pandas Series
        The parameters of the model.

    """

    return self.sarimax_res.params

summary

summary(alpha=0.05, start=None)

Get a summary of the SARIMAXResults object.

Parameters:

Name Type Description Default
alpha float

The confidence intervals for the forecasts are (1 - alpha) %.

0.05
start int

Integer of the start observation.

None

Returns:

Name Type Description
summary Summary instance

This holds the summary table and text, which can be printed or converted to various output formats.

Source code in skforecast\stats\_sarimax.py
@check_is_fitted
def summary(
    self,
    alpha: float = 0.05,
    start: int = None
) -> object:
    """
    Get a summary of the SARIMAXResults object.

    Parameters
    ----------
    alpha : float, default 0.05
        The confidence intervals for the forecasts are (1 - alpha) %.
    start : int, default None
        Integer of the start observation.

    Returns
    -------
    summary : Summary instance
        This holds the summary table and text, which can be printed or 
        converted to various output formats.

    """

    return self.sarimax_res.summary(alpha=alpha, start=start)

get_info_criteria

get_info_criteria(criteria='aic', method='standard')

Get the selected information criteria.

See https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAXResults.info_criteria.html for more details about the statsmodels info_criteria method.

Parameters:

Name Type Description Default
criteria str

The information criteria to compute. Valid options are {'aic', 'bic', 'hqic'}.

'aic'
method str

The method for information criteria computation. Default is 'standard' method; 'lutkepohl' computes the information criteria as in Lütkepohl (2007).

'standard'

Returns:

Name Type Description
metric float

The value of the selected information criteria.

Source code in skforecast\stats\_sarimax.py
@check_is_fitted
def get_info_criteria(
    self,
    criteria: str = 'aic',
    method: str = 'standard'
) -> float:
    """
    Get the selected information criteria.

    Check https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAXResults.info_criteria.html
    to know more about statsmodels info_criteria method.

    Parameters
    ----------
    criteria : str, default 'aic'
        The information criteria to compute. Valid options are {'aic', 'bic',
        'hqic'}.
    method : str, default 'standard'
        The method for information criteria computation. Default is 'standard'
        method; 'lutkepohl' computes the information criteria as in Lütkepohl
        (2007).

    Returns
    -------
    metric : float
        The value of the selected information criteria.

    """

    if criteria not in ['aic', 'bic', 'hqic']:
        raise ValueError(
            "Invalid value for `criteria`. Valid options are 'aic', 'bic', "
            "and 'hqic'."
        )

    if method not in ['standard', 'lutkepohl']:
        raise ValueError(
            "Invalid value for `method`. Valid options are 'standard' and "
            "'lutkepohl'."
        )

    metric = self.sarimax_res.info_criteria(criteria=criteria, method=method)

    return metric
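
For example, once the model has been fitted, the information criteria can be retrieved as follows (a sketch continuing the example above).

aic  = model.get_info_criteria(criteria='aic')
bic  = model.get_info_criteria(criteria='bic')
hqic = model.get_info_criteria(criteria='hqic', method='lutkepohl')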

get_feature_importances

get_feature_importances()

Get feature importances for SARIMAX statsmodels model.

Source code in skforecast\stats\_sarimax.py
@check_is_fitted
def get_feature_importances(self) -> pd.DataFrame:
    """Get feature importances for SARIMAX statsmodels model."""

    feature_importances = self.params().to_frame().reset_index()
    feature_importances.columns = ['feature', 'importance']

    return feature_importances
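
Continuing the example above, the returned object is a two-column pandas DataFrame.

importances = model.get_feature_importances()
# DataFrame with columns 'feature' and 'importance' holding the names and
# values of the fitted coefficients (trend, exogenous, AR and MA terms).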

skforecast.stats._ets.Ets

Ets(
    m=1,
    model="ZZZ",
    damped=None,
    alpha=None,
    beta=None,
    gamma=None,
    phi=None,
    lambda_param=None,
    lambda_auto=False,
    bias_adjust=True,
    bounds="both",
    seasonal=True,
    trend=None,
    ic="aicc",
    allow_multiplicative=True,
    allow_multiplicative_trend=False,
)

Bases: BaseEstimator, RegressorMixin

Scikit-learn style wrapper for the ETS (Error, Trend, Seasonality) model.

This estimator treats a univariate time series as input. Call fit(y) with a 1D array-like of observations in time order, then produce out-of-sample forecasts via predict(steps) and prediction intervals via predict_interval(steps, level=...). In-sample diagnostics are available through fitted_, residuals_() and summary().

Parameters:

Name Type Description Default
m int

Seasonal period (e.g., 12 for monthly data with yearly seasonality).

1
model (str, None)

Three-letter model specification (e.g., "ANN", "AAA", "MAM"):
  • First letter: Error type (A=Additive, M=Multiplicative, Z=Auto)
  • Second letter: Trend type (N=None, A=Additive, M=Multiplicative, Z=Auto)
  • Third letter: Season type (N=None, A=Additive, M=Multiplicative, Z=Auto)
Use "ZZZ" or None for automatic model selection.

"ZZZ"
damped bool or None

Whether to use damped trend. If None, both damped and non-damped models are tried (only when model="ZZZ" or model=None).

None
alpha float

Smoothing parameter for level (0 < alpha < 1). If None, estimated.

None
beta float

Smoothing parameter for trend (0 < beta < alpha). If None, estimated.

None
gamma float

Smoothing parameter for seasonality (0 < gamma < 1-alpha). If None, estimated.

None
phi float

Damping parameter (0 < phi < 1). If None, estimated.

None
lambda_param float

Box-Cox transformation parameter. If None, no transformation applied.

None
lambda_auto bool

If True, automatically select optimal Box-Cox lambda parameter.

False
bias_adjust bool

Apply bias adjustment when back-transforming forecasts.

True
bounds str

Parameter bounds type: "usual", "admissible", or "both".

"both"
seasonal bool

Allow seasonal models (only used with model="ZZZ" or model=None).

True
trend bool

Allow trend models. If None, automatically determined (only with model="ZZZ" or model=None).

None
ic ('aic', 'aicc', 'bic')

Information criterion for model selection (only with model="ZZZ" or model=None).

"aic"
allow_multiplicative bool

Allow multiplicative error and season models (only with model="ZZZ" or model=None).

True
allow_multiplicative_trend bool

Allow multiplicative trend models (only with model="ZZZ" or model=None).

False

Attributes:

Name Type Description
m int

Seasonal period (e.g., 12 for monthly data with yearly seasonality).

model str

Three-letter model specification (e.g., "ANN", "AAA", "MAM"). Each letter represents error, trend, and season types respectively, using A (Additive), M (Multiplicative), N (None), or Z (Auto-select).

damped bool or None

Whether to apply damping to the trend component. If None with model="ZZZ" or model=None, both damped and non-damped models are evaluated during automatic selection.

alpha float or None

User-provided smoothing parameter for the level component (0 < alpha < 1). When None, the parameter is estimated during fitting.

beta float or None

User-provided smoothing parameter for the trend component (0 < beta < alpha). When None, the parameter is estimated during fitting if trend is present.

gamma float or None

User-provided smoothing parameter for the seasonal component (0 < gamma < 1-alpha). When None, the parameter is estimated during fitting if seasonality is present.

phi float or None

User-provided damping parameter (0 < phi < 1). When None, the parameter is estimated during fitting if damped trend is used.

lambda_param float or None

Box-Cox transformation parameter applied to the time series before fitting. When None, no transformation is applied unless lambda_auto is True.

lambda_auto bool

Whether to automatically determine the optimal Box-Cox transformation parameter during model fitting.

bias_adjust bool

Whether to apply bias adjustment when back-transforming forecasts from the Box-Cox transformed scale to the original scale.

bounds str

Type of parameter bounds used during optimization: "usual" for standard bounds, "admissible" for stability-ensuring bounds, or "both" for their intersection.

seasonal bool

Whether seasonal models are considered during automatic model selection (only applicable when model="ZZZ" or model=None).

trend bool or None

Whether trend models are considered during automatic model selection. When None with model="ZZZ" or model=None, this is determined automatically based on the data.

ic {'aic', 'aicc', 'bic'}

Information criterion used to compare and select the best model during automatic model selection (only applicable when model="ZZZ" or model=None).

allow_multiplicative bool

Whether multiplicative error and seasonal components are allowed during automatic model selection (only applicable when model="ZZZ" or model=None).

allow_multiplicative_trend bool

Whether multiplicative trend components are allowed during automatic model selection (only applicable when model="ZZZ" or model=None).

model_ ETSModel or None

The fitted ETS model object containing parameters, diagnostics, and state space representation. Available after calling fit().

model_config_ dict or None

Dictionary containing the model configuration including error type, trend type, seasonal type, damping flag, and seasonal period. Available after calling fit().

params_ dict or None

Dictionary of estimated model parameters including smoothing parameters (alpha, beta, gamma, phi) and initial state values. Available after calling fit().

aic_ float or None

Akaike Information Criterion of the fitted model, measuring the quality of fit while penalizing model complexity. Available after calling fit().

bic_ float or None

Bayesian Information Criterion of the fitted model, similar to AIC but with a stronger penalty for model complexity. Available after calling fit().

y_train_ ndarray of shape (n_samples,) or None

The original training time series used to fit the model.

fitted_values_ ndarray of shape (n_samples,) or None

One-step-ahead in-sample fitted values from the model.

in_sample_residuals_ ndarray of shape (n_samples,) or None

In-sample residuals calculated as the difference between observed values and fitted values.

n_features_in_ int or None

Number of features (time series) seen during fit(). For ETS, this is always 1 as it handles univariate time series. Available after calling fit().

is_memory_reduced bool

Flag indicating whether reduce_memory() has been called to clear diagnostic arrays (y_train_, fitted_values_, in_sample_residuals_).

is_fitted bool

Flag indicating whether the model has been successfully fitted to data.

estimator_name_ str

String identifier of the fitted model configuration (e.g., "Ets(AAA)"). This is updated after fitting to reflect the selected model.

is_auto bool

Indicates whether automatic model selection was used (model="ZZZ" or model=None).

best_params_ dict or None

If automatic model selection was used (model="ZZZ" or model=None), this dictionary contains the parameters of the selected best model. Otherwise, it is None.

Methods:

Name Description
fit

Fit the ETS model to a univariate time series.

predict

Generate mean forecasts steps ahead.

predict_interval

Forecast with prediction intervals.

get_residuals

Get in-sample residuals (observed - fitted) from the ETS model.

get_fitted_values

Get in-sample fitted values from the ETS model.

get_score

R^2 using in-sample fitted values.

get_params

Get parameters for this estimator.

get_feature_importances

Get feature importances for Ets model.

get_info_criteria

Get information criteria.

set_params

Set the parameters of this estimator and reset the fitted state.

summary

Print a summary of the fitted ETS model.

reduce_memory

Reduce memory usage by removing internal arrays not needed for prediction.

Source code in skforecast\stats\_ets.py
def __init__(
    self,
    m: int = 1,
    model: str | None = "ZZZ",
    damped: bool | None = None,
    alpha: float | None = None,
    beta: float | None = None,
    gamma: float | None = None,
    phi: float | None = None,
    lambda_param: float | None = None,
    lambda_auto: bool = False,
    bias_adjust: bool = True,
    bounds: str = "both",
    seasonal: bool = True,
    trend: bool | None = None,
    ic: Literal["aic", "aicc", "bic"] = "aicc",
    allow_multiplicative: bool = True,
    allow_multiplicative_trend: bool = False,
):

    if not isinstance(m, int) or m < 1:
        raise ValueError(
            f"`m` must be a positive integer greater than or equal to 1. "
            f"Got {m}."
        )

    self.m                          = m
    self.model                      = model if model is not None else "ZZZ"
    self.damped                     = damped
    self.alpha                      = alpha
    self.beta                       = beta
    self.gamma                      = gamma
    self.phi                        = phi
    self.lambda_param               = lambda_param
    self.lambda_auto                = lambda_auto
    self.bias_adjust                = bias_adjust
    self.bounds                     = bounds
    self.seasonal                   = seasonal
    self.trend                      = trend
    self.ic                         = ic
    self.allow_multiplicative       = allow_multiplicative
    self.allow_multiplicative_trend = allow_multiplicative_trend

    self.model_                     = None
    self.model_config_              = None
    self.params_                    = None
    self.aic_                       = None
    self.bic_                       = None
    self.y_train_                   = None
    self.fitted_values_             = None
    self.in_sample_residuals_       = None
    self.n_features_in_             = None
    self.is_memory_reduced          = False
    self.is_fitted                  = False
    self.best_params_               = None
    self.is_auto                    = self.model == "ZZZ"

    if self.is_auto:
        self.estimator_name_ = "AutoEts()"
    else:
        self.estimator_name_ = f"Ets({self.model})"

m instance-attribute

m = m

model instance-attribute

model = model if model is not None else 'ZZZ'

damped instance-attribute

damped = damped

alpha instance-attribute

alpha = alpha

beta instance-attribute

beta = beta

gamma instance-attribute

gamma = gamma

phi instance-attribute

phi = phi

lambda_param instance-attribute

lambda_param = lambda_param

lambda_auto instance-attribute

lambda_auto = lambda_auto

bias_adjust instance-attribute

bias_adjust = bias_adjust

bounds instance-attribute

bounds = bounds

seasonal instance-attribute

seasonal = seasonal

trend instance-attribute

trend = trend

ic instance-attribute

ic = ic

allow_multiplicative instance-attribute

allow_multiplicative = allow_multiplicative

allow_multiplicative_trend instance-attribute

allow_multiplicative_trend = allow_multiplicative_trend

model_ instance-attribute

model_ = None

model_config_ instance-attribute

model_config_ = None

params_ instance-attribute

params_ = None

aic_ instance-attribute

aic_ = None

bic_ instance-attribute

bic_ = None

y_train_ instance-attribute

y_train_ = None

fitted_values_ instance-attribute

fitted_values_ = None

in_sample_residuals_ instance-attribute

in_sample_residuals_ = None

n_features_in_ instance-attribute

n_features_in_ = None

is_memory_reduced instance-attribute

is_memory_reduced = False

is_fitted instance-attribute

is_fitted = False

best_params_ instance-attribute

best_params_ = None

is_auto instance-attribute

is_auto = model == 'ZZZ'

estimator_name_ instance-attribute

estimator_name_ = 'AutoEts()'

fit

fit(y, exog=None)

Fit the ETS model to a univariate time series.

Parameters:

Name Type Description Default
y array-like of shape (n_samples,)

Time-ordered numeric sequence.

required
exog Ignored

Exogenous variables. Ignored, present for API compatibility.

None

Returns:

Name Type Description
self Ets

Fitted estimator.

Source code in skforecast\stats\_ets.py
def fit(self, y: pd.Series | np.ndarray, exog: None = None) -> Ets:
    """
    Fit the ETS model to a univariate time series.

    Parameters
    ----------
    y : array-like of shape (n_samples,)
        Time-ordered numeric sequence.
    exog : Ignored
        Exogenous variables. Ignored, present for API compatibility.

    Returns
    -------
    self : Ets
        Fitted estimator.

    """

    self.model_               = None
    self.model_config_        = None
    self.params_              = None
    self.aic_                 = None
    self.bic_                 = None
    self.y_train_             = None
    self.fitted_values_       = None
    self.in_sample_residuals_ = None
    self.n_features_in_       = None
    self.is_memory_reduced    = False
    self.is_fitted            = False
    self.best_params_         = None

    if not isinstance(y, (pd.Series, np.ndarray)):
        raise ValueError("`y` must be a pandas Series or numpy ndarray.")

    y = np.asarray(y, dtype=np.float64)
    if y.ndim == 2 and y.shape[1] == 1:
        # Allow (n, 1) shaped arrays and squeeze to 1D
        y = y.ravel()
    elif y.ndim != 1:
        raise ValueError("`y` must be a 1D array-like sequence.")
    if len(y) < 1:
        raise ValueError("`y` is too short to fit ETS model.")

    # Automatic model selection
    if self.model == "ZZZ":
        self.model_ = auto_ets(
            y,
            m                          = self.m,
            seasonal                   = self.seasonal,
            trend                      = self.trend,
            damped                     = self.damped,
            ic                         = self.ic,
            allow_multiplicative       = self.allow_multiplicative,
            allow_multiplicative_trend = self.allow_multiplicative_trend,
            lambda_auto                = self.lambda_auto,
            verbose                    = False,
        )

        self.best_params_ = {
            "m": self.model_.config.m,
            "model": f"{self.model_.config.error}{self.model_.config.trend}{self.model_.config.season}",
            "damped": self.model_.config.damped,
            "alpha": self.model_.params.alpha,
            "beta": self.model_.params.beta,
            "gamma": self.model_.params.gamma,
            "phi": self.model_.params.phi,
            "lambda_param": self.lambda_param,
            "lambda_auto": self.lambda_auto,
            "bias_adjust": self.bias_adjust,
            "bounds": self.bounds,
            "seasonal": self.seasonal,
            "trend": self.trend,
            "ic": self.ic,
            "allow_multiplicative": self.allow_multiplicative,
            "allow_multiplicative_trend": self.allow_multiplicative_trend,
        }

    else:
        # Fit specific model
        damped_param = False if self.damped is None else self.damped
        self.model_ = ets(
            y,
            m            = self.m,
            model        = self.model,
            damped       = damped_param,
            alpha        = self.alpha,
            beta         = self.beta,
            gamma        = self.gamma,
            phi          = self.phi,
            lambda_param = self.lambda_param,
            lambda_auto  = self.lambda_auto,
            bias_adjust  = self.bias_adjust,
            bounds       = self.bounds,
        )

    # Extract model attributes (use references to avoid duplicating arrays)
    self.model_config_        = asdict(self.model_.config)
    self.params_              = asdict(self.model_.params)
    self.aic_                 = self.model_.aic
    self.bic_                 = self.model_.bic
    self.y_train_             = self.model_.y_original
    self.fitted_values_       = self.model_.fitted
    self.in_sample_residuals_ = self.model_.residuals
    self.n_features_in_       = 1
    self.is_fitted            = True

    model_name = f"{self.model_config_['error']}{self.model_config_['trend']}{self.model_config_['season']}"
    if self.model_config_['damped'] and self.model_config_['trend'] != "N":
        model_name = f"{self.model_config_['error']}{self.model_config_['trend']}d{self.model_config_['season']}"

    self.estimator_name_ = f"Ets({model_name})"

    return self
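
A minimal usage sketch, assuming Ets can be imported from the module referenced above (skforecast.stats._ets) and using an illustrative synthetic series; names and values here are examples, not part of the library's official documentation:

import numpy as np
from skforecast.stats._ets import Ets  # assumed import path, taken from the source reference above

# Illustrative monthly-like series with a seasonal pattern
rng = np.random.default_rng(0)
y = 10 + np.sin(np.arange(120) * 2 * np.pi / 12) + rng.normal(scale=0.2, size=120)

estimator = Ets(model="ZZZ", m=12)  # "ZZZ" triggers automatic model selection during fit
estimator.fit(y)
print(estimator.estimator_name_)    # e.g. "Ets(ANA)" once a model has been selected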

predict

predict(steps, exog=None)

Generate mean forecasts steps ahead.

Parameters:

Name Type Description Default
steps int

Forecast horizon (must be > 0).

required
exog None

Exogenous variables. Ignored, present for API compatibility.

None

Returns:

Name Type Description
predictions ndarray of shape (steps,)

Point forecasts for steps 1..h.

Source code in skforecast\stats\_ets.py
@check_is_fitted
def predict(self, steps: int, exog: None = None) -> np.ndarray:
    """
    Generate mean forecasts steps ahead.

    Parameters
    ----------
    steps : int
        Forecast horizon (must be > 0).
    exog : None
        Exogenous variables. Ignored, present for API compatibility.

    Returns
    -------
    predictions : ndarray of shape (steps,)
        Point forecasts for steps 1..h.

    """

    if not isinstance(steps, (int, np.integer)) or steps <= 0:
        raise ValueError("`steps` must be a positive integer.")

    predictions = forecast_ets(
        self.model_,
        h           = steps,
        bias_adjust = self.bias_adjust,
        level       = None
    )
    return predictions["mean"]

predict_interval

predict_interval(
    steps=1, level=(80, 95), as_frame=True, exog=None
)

Forecast with prediction intervals.

Parameters:

Name Type Description Default
steps int

Forecast horizon.

1
level list or tuple of float

Confidence levels in percent.

(80, 95)
as_frame bool

If True, return a tidy DataFrame with columns 'mean', 'lower_<L>', 'upper_<L>' for each level L. If False, return a NumPy ndarray.

True
exog Ignored

Exogenous variables. Ignored, present for API compatibility.

None

Returns:

Name Type Description
predictions numpy ndarray, pandas DataFrame

If as_frame=True, pandas DataFrame with columns 'mean', 'lower_<L>', 'upper_<L>' for each level L. If as_frame=False, numpy ndarray.

Source code in skforecast\stats\_ets.py
@check_is_fitted
def predict_interval(
    self,
    steps: int = 1,
    level: list[float] | tuple[float, ...] = (80, 95),
    as_frame: bool = True,
    exog: Any = None,
) -> np.ndarray | pd.DataFrame:
    """
    Forecast with prediction intervals.

    Parameters
    ----------
    steps : int, default 1
        Forecast horizon.
    level : list or tuple of float, default (80, 95)
        Confidence levels in percent.
    as_frame : bool, default True
        If True, return a tidy DataFrame with columns 'mean', 'lower_<L>',
        'upper_<L>' for each level L. If False, return a NumPy ndarray.
    exog : Ignored
        Exogenous variables. Ignored, present for API compatibility.

    Returns
    -------
    predictions : numpy ndarray, pandas DataFrame
        If as_frame=True, pandas DataFrame with columns 'mean', 'lower_<L>',
        'upper_<L>' for each level L. If as_frame=False, numpy ndarray.

    """

    if not isinstance(steps, (int, np.integer)) or steps <= 0:
        raise ValueError("`steps` must be a positive integer.")

    raw_preds = forecast_ets(
        self.model_,
        h           = steps,
        bias_adjust = self.bias_adjust,
        level       = list(level)
    )

    levels = list(level) if level is not None else []
    n_levels = len(levels)
    mean = np.asarray(raw_preds["mean"])

    predictions = np.empty((steps, 1 + 2 * n_levels), dtype=float)
    predictions[:, 0] = mean
    for i, lv in enumerate(levels):
        lv_int = int(lv)
        lower_key = f"lower_{lv_int}"
        upper_key = f"upper_{lv_int}"
        lower_arr = np.asarray(raw_preds[lower_key])
        upper_arr = np.asarray(raw_preds[upper_key])
        predictions[:, 1 + 2 * i] = lower_arr
        predictions[:, 1 + 2 * i + 1] = upper_arr

    if as_frame:
        col_names = ["mean"]
        for level in levels:
            level = int(level)
            col_names.append(f"lower_{level}")
            col_names.append(f"upper_{level}")

        predictions = pd.DataFrame(
            predictions, columns=col_names, index=pd.RangeIndex(1, steps + 1, name="step")
        )

    return predictions
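
A hedged continuation of the fit sketch above, showing the column layout produced when as_frame=True; the names follow the lower_<L>/upper_<L> pattern built in the code:

preds = estimator.predict_interval(steps=12, level=(80, 95), as_frame=True)
print(list(preds.columns))            # ['mean', 'lower_80', 'upper_80', 'lower_95', 'upper_95']
point = estimator.predict(steps=12)   # plain ndarray of 12 point forecasts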

get_residuals

get_residuals()

Get in-sample residuals (observed - fitted) from the ETS model.

Returns:

Name Type Description
residuals ndarray of shape (n_samples,)
Source code in skforecast\stats\_ets.py
@check_is_fitted
def get_residuals(self) -> np.ndarray:
    """
    Get in-sample residuals (observed - fitted) from the ETS model.

    Returns
    -------
    residuals : ndarray of shape (n_samples,)

    """

    check_memory_reduced(self, method_name='get_residuals')
    return self.in_sample_residuals_

get_fitted_values

get_fitted_values()

Get in-sample fitted values from the ETS model.

Returns:

Name Type Description
fitted ndarray of shape (n_samples,)
Source code in skforecast\stats\_ets.py
@check_is_fitted
def get_fitted_values(self) -> np.ndarray:
    """
    Get in-sample fitted values from the ETS model.

    Returns
    -------
    fitted : ndarray of shape (n_samples,)

    """

    check_memory_reduced(self, method_name='get_fitted_values')
    return self.fitted_values_

get_score

get_score(y=None)

R^2 using in-sample fitted values.

Parameters:

Name Type Description Default
y Ignored

Present for API compatibility.

None

Returns:

Name Type Description
score float

Coefficient of determination.

Source code in skforecast\stats\_ets.py
@check_is_fitted
def get_score(self, y: Any = None) -> float:
    """
    R^2 using in-sample fitted values.

    Parameters
    ----------
    y : Ignored
        Present for API compatibility.

    Returns
    -------
    score : float
        Coefficient of determination.

    """

    check_memory_reduced(self, method_name='get_score')

    y = self.y_train_
    fitted = self.fitted_values_

    # Handle NaN values if any
    mask = ~(np.isnan(y) | np.isnan(fitted))
    if mask.sum() < 2:
        return float("nan")

    ss_res = np.sum((y[mask] - fitted[mask]) ** 2)
    ss_tot = np.sum((y[mask] - y[mask].mean()) ** 2) + np.finfo(float).eps

    return 1.0 - ss_res / ss_tot
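
In formula form, the score computed above is the coefficient of determination over the non-NaN pairs, with a small epsilon added to the denominator for numerical safety:

$$
R^2 = 1 - \frac{\sum_t \left(y_t - \hat{y}_t\right)^2}{\sum_t \left(y_t - \bar{y}\right)^2 + \varepsilon}
$$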

get_params

get_params(deep=True)

Get parameters for this estimator.

Parameters:

Name Type Description Default
deep bool

If True, will return the parameters for this estimator and contained subobjects that are estimators.

True

Returns:

Name Type Description
params dict

Parameter names mapped to their values.

Source code in skforecast\stats\_ets.py
def get_params(self, deep: bool = True) -> dict:
    """
    Get parameters for this estimator.

    Parameters
    ----------
    deep : bool, default True
        If True, will return the parameters for this estimator and
        contained subobjects that are estimators.

    Returns
    -------
    params : dict
        Parameter names mapped to their values.

    """

    return {
        "m": self.m,
        "model": self.model,
        "damped": self.damped,
        "alpha": self.alpha,
        "beta": self.beta,
        "gamma": self.gamma,
        "phi": self.phi,
        "seasonal": self.seasonal,
        "trend": self.trend,
        "allow_multiplicative": self.allow_multiplicative,
        "allow_multiplicative_trend": self.allow_multiplicative_trend,
    }

get_feature_importances

get_feature_importances()

Get feature importances for Ets model.

Source code in skforecast\stats\_ets.py
@check_is_fitted
def get_feature_importances(self) -> pd.DataFrame:
    """Get feature importances for Eta model."""
    features = ['alpha (level)']
    importances = [self.params_['alpha']]

    if self.model_config_['trend'] != 'N':
        features.append('beta (trend)')
        importances.append(self.params_['beta'])

    if self.model_config_['season'] != 'N':
        features.append('gamma (seasonal)')
        importances.append(self.params_['gamma'])

    if self.model_config_['damped']:
        features.append('phi (damping)')
        importances.append(self.params_['phi'])

    return pd.DataFrame({
        'feature': features,
        'importance': importances
    })

get_info_criteria

get_info_criteria(criteria)

Get information criteria.

Parameters:

Name Type Description Default
criteria str

Information criterion to retrieve. Valid options are 'aic' and 'bic'.

required

Returns:

Name Type Description
info_criteria float

Value of the requested information criterion.

Source code in skforecast\stats\_ets.py
@check_is_fitted
def get_info_criteria(self, criteria: str) -> float:
    """
    Get information criteria.

    Parameters
    ----------
    criteria : str
        Information criterion to retrieve. Valid options are 'aic' and 'bic'.

    Returns
    -------
    info_criteria : float
        Value of the requested information criterion.

    """
    if criteria not in {'aic', 'bic'}:
        raise ValueError(
            "Invalid value for `criteria`. Valid options are 'aic' and 'bic' "
            "for ETS model."
        )

    if criteria == 'aic':
        value = self.aic_
    elif criteria == 'bic':
        value = self.bic_

    return value

_set_params

_set_params(**params)

Set the parameters of this estimator without resetting the fitted state. This method is intended for internal use only; please use set_params() instead.

Parameters:

Name Type Description Default
**params dict

Estimator parameters.

{}

Returns:

Type Description
None
Source code in skforecast\stats\_ets.py
def _set_params(self, **params) -> None:
    """
    Set the parameters of this estimator without resetting the fitted state.
    This method is intended for internal use only; please use `set_params()`
    instead.

    Parameters
    ----------
    **params : dict
        Estimator parameters.

    Returns
    -------
    None

    """

    for key, value in params.items():
        setattr(self, key, value)

    self.is_auto = self.model is None or self.model == "ZZZ"
    if self.is_auto:
        self.model = "ZZZ"
        estimator_name_ = "AutoEts()"
    else:
        estimator_name_ = f"Ets({self.model})"

    self.estimator_name_ = estimator_name_

set_params

set_params(**params)

Set the parameters of this estimator and reset the fitted state.

This method resets the estimator to its unfitted state whenever parameters are changed, requiring the model to be refitted before making predictions.

Parameters:

Name Type Description Default
**params dict

Estimator parameters. Valid parameter keys are: 'm', 'model', 'damped', 'alpha', 'beta', 'gamma', 'phi', 'lambda_param', 'lambda_auto', 'bias_adjust', 'bounds', 'seasonal', 'trend', 'ic', 'allow_multiplicative', 'allow_multiplicative_trend'.

{}

Returns:

Name Type Description
self Ets

The estimator with updated parameters and reset state.

Raises:

Type Description
ValueError

If any parameter key is invalid.

Source code in skforecast\stats\_ets.py
def set_params(self, **params) -> Ets:
    """
    Set the parameters of this estimator and reset the fitted state.

    This method resets the estimator to its unfitted state whenever parameters
    are changed, requiring the model to be refitted before making predictions.

    Parameters
    ----------
    **params : dict
        Estimator parameters. Valid parameter keys are: 'm', 'model', 'damped',
        'alpha', 'beta', 'gamma', 'phi', 'lambda_param', 'lambda_auto',
        'bias_adjust', 'bounds', 'seasonal', 'trend', 'ic', 'allow_multiplicative',
        'allow_multiplicative_trend'.

    Returns
    -------
    self : Ets
        The estimator with updated parameters and reset state.

    Raises
    ------
    ValueError
        If any parameter key is invalid.

    """

    valid_params = {
        'm', 'model', 'damped', 'alpha', 'beta', 'gamma', 'phi',
        'lambda_param', 'lambda_auto', 'bias_adjust', 'bounds',
        'seasonal', 'trend', 'ic', 'allow_multiplicative',
        'allow_multiplicative_trend'
    }
    for key in params.keys():
        if key not in valid_params:
            raise ValueError(
                f"Invalid parameter '{key}' for estimator {self.__class__.__name__}. "
                f"Valid parameters are: {sorted(valid_params)}"
            )

    self._set_params(**params)

    # Reset fitted state - model needs to be refitted with new parameters
    self.model_               = None
    self.model_config_        = None
    self.params_              = None
    self.y_train_             = None
    self.fitted_values_       = None
    self.in_sample_residuals_ = None
    self.n_features_in_       = None
    self.is_memory_reduced    = False
    self.is_fitted            = False
    self.best_params_         = None

    return self
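
A short sketch of the reset behaviour described above, continuing the hypothetical estimator from the earlier example:

estimator.set_params(m=4, trend="A")   # only valid keys are accepted; others raise ValueError
print(estimator.is_fitted)             # False: the fitted state was cleared
estimator.fit(y)                       # refit before calling predict again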

summary

summary()

Print a summary of the fitted ETS model.

Source code in skforecast\stats\_ets.py
@check_is_fitted
def summary(self) -> None:
    """
    Print a summary of the fitted ETS model.
    """

    print("ETS Model Summary")
    print("=" * 60)
    print(f"Model: {self.estimator_name_}")
    print(f"Seasonal period (m): {self.model_config_['m']}")
    print()

    print("Smoothing parameters:")
    print(f"  alpha (level):       {self.params_['alpha']:.4f}")
    if self.model_config_['trend'] != "N":
        print(f"  beta (trend):        {self.params_['beta']:.4f}")
    if self.model_config_['season'] != "N":
        print(f"  gamma (seasonal):    {self.params_['gamma']:.4f}")
    if self.model_config_['damped']:
        print(f"  phi (damping):       {self.params_['phi']:.4f}")
    print()

    print("Initial states:")
    print(f"  Level (l0):          {self.params_['init_states'][0]:.4f}")
    if self.model_config_['trend'] != "N" and len(self.params_['init_states']) > 1:
        print(f"  Trend (b0):          {self.params_['init_states'][1]:.4f}")
    print()

    print("Model fit statistics:")
    print(f"  sigma^2:             {self.model_.sigma2:.6f}")
    print(f"  Log-likelihood:      {self.model_.loglik:.2f}")
    print(f"  AIC:                 {self.aic_:.2f}")
    print(f"  BIC:                 {self.bic_:.2f}")
    print()

    if not self.is_memory_reduced:
        print("Residual statistics:")
        print(f"  Mean:                {np.mean(self.in_sample_residuals_):.6f}")
        print(f"  Std Dev:             {np.std(self.in_sample_residuals_, ddof=1):.6f}")
        print(f"  MAE:                 {np.mean(np.abs(self.in_sample_residuals_)):.6f}")
        print(f"  RMSE:                {np.sqrt(np.mean(self.in_sample_residuals_**2)):.6f}")
        print()

        print("Time Series Summary Statistics:")
        print(f"Number of observations: {len(self.y_train_)}")
        print(f"  Mean:                 {np.mean(self.y_train_):.4f}")
        print(f"  Std Dev:              {np.std(self.y_train_, ddof=1):.4f}")
        print(f"  Min:                  {np.min(self.y_train_):.4f}")
        print(f"  25%:                  {np.percentile(self.y_train_, 25):.4f}")
        print(f"  Median:               {np.median(self.y_train_):.4f}")
        print(f"  75%:                  {np.percentile(self.y_train_, 75):.4f}")
        print(f"  Max:                  {np.max(self.y_train_):.4f}")

reduce_memory

reduce_memory()

Reduce memory usage by removing internal arrays not needed for prediction. This method clears memory-heavy arrays that are only needed for diagnostics but not for prediction. After calling this method, the following methods will raise an error:

  • get_fitted_values(): In-sample fitted values
  • get_residuals(): In-sample residuals
  • get_score(): R² coefficient
  • summary(): Model summary statistics

Prediction methods remain fully functional:

  • predict(): Point forecasts
  • predict_interval(): Prediction intervals

Returns:

Name Type Description
self Ets

The estimator with reduced memory usage.

Source code in skforecast\stats\_ets.py
@check_is_fitted
def reduce_memory(self) -> Ets:
    """
    Reduce memory usage by removing internal arrays not needed for prediction.
    This method clears memory-heavy arrays that are only needed for diagnostics
    but not for prediction. After calling this method, the following methods
    will raise an error:

    - get_fitted_values(): In-sample fitted values
    - get_residuals(): In-sample residuals
    - get_score(): R² coefficient
    - summary(): Model summary statistics

    Prediction methods remain fully functional:

    - predict(): Point forecasts
    - predict_interval(): Prediction intervals

    Returns
    -------
    self : Ets
        The estimator with reduced memory usage.

    """

    # Clear arrays at Ets level
    self.y_train_ = None
    self.fitted_values_ = None
    self.in_sample_residuals_ = None

    # Clear arrays at ETSModel level
    if hasattr(self, 'model_'):
        self.model_.fitted = None
        self.model_.residuals = None
        self.model_.y_original = None

    self.is_memory_reduced = True

    return self
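
A hedged usage sketch: after reduce_memory(), forecasting still works but the diagnostic accessors raise because their arrays were cleared:

estimator.fit(y)
estimator.reduce_memory()
estimator.predict(steps=12)        # still available
# estimator.get_residuals()       # would now raise: diagnostic arrays were cleared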

skforecast.stats._arar.Arar

Arar(max_ar_depth=None, max_lag=None, safe=True)

Bases: BaseEstimator, RegressorMixin

Scikit-learn style wrapper for the ARAR time-series model.

This estimator treats a univariate sequence as "the feature". Call fit(y) with a 1D array-like of observations in time order, then produce out-of-sample forecasts via predict(steps) and prediction intervals via predict_interval(steps, level=...). In-sample diagnostics are available through get_fitted_values(), get_residuals() and summary().

Parameters:

Name Type Description Default
max_ar_depth int

Maximum AR depth considered for the (1, i, j, k) AR selection stage.

None
max_lag int

Maximum lag used when estimating autocovariances.

None
safe bool

If True, falls back to a mean-only model on numerical issues or very short series; otherwise errors are raised.

True

Attributes:

Name Type Description
max_ar_depth int or None

Maximum AR depth considered for the (1, i, j, k) AR selection stage during model fitting. When None, a default value is determined automatically based on the series length.

max_lag int or None

Maximum lag used when estimating autocovariances during the memory-shortening step. When None, a default value is determined automatically based on the series length.

safe bool

Whether to use safe mode. When True, the model falls back to a mean-only forecast on numerical issues or very short series. When False, errors are raised instead.

model_ tuple or None

Raw tuple returned by the underlying ARAR algorithm containing: (Y, best_phi, best_lag, sigma2, psi, sbar, max_ar_depth, max_lag). Available after calling fit().

coef_ ndarray of shape (4,) or None

Estimated AR coefficients for the selected lags (1, i, j, k). Some coefficients may be zero if the corresponding lag was not selected. Available after calling fit().

lags_ tuple or None

Selected lag indices (1, i, j, k) used in the AR model, where each represents which past observations contribute to the forecast. Available after calling fit().

sigma2_ float or None

Estimated innovation variance (one-step-ahead forecast error variance) from the fitted ARAR model. Available after calling fit().

psi_ ndarray or None

Memory-shortening filter coefficients used to transform the original series into one with shorter memory before AR fitting. Available after calling fit().

sbar_ float or None

Mean of the memory-shortened series, used as the long-run mean in forecasting. Available after calling fit().

aic_ float or None

Akaike Information Criterion measuring model fit quality while penalizing complexity. For models with exogenous variables, this is an approximate calculation that treats the two-step procedure (regression + ARAR) as independent stages, which may underestimate total model complexity. Available after calling fit().

bic_ float or None

Bayesian Information Criterion, similar to AIC but with a stronger penalty for model complexity. For models with exogenous variables, this is an approximate calculation that treats the two-step procedure (regression + ARAR) as independent stages, which may underestimate total model complexity. Available after calling fit().

exog_model_ FastLinearRegression or None

Fitted linear regression model for exogenous variables. When exogenous variables are provided during fitting, this model captures their linear relationship with the target series. Available after calling fit() with exogenous variables.

coef_exog_ ndarray of shape (n_exog_features,) or None

Coefficients from the exogenous variables regression model, excluding the intercept. Available after calling fit() with exogenous variables.

n_exog_features_in_ int or None

Number of exogenous features used during fitting. Zero if no exogenous variables were provided. Available after calling fit().

y_train_ ndarray of shape (n_samples,) or None

Original training time series used to fit the model.

fitted_values_ ndarray of shape (n_samples,) or None

One-step-ahead in-sample fitted values. The first k-1 values may be NaN where k is the largest lag used.

in_sample_residuals_ ndarray of shape (n_samples,) or None

In-sample residuals calculated as the difference between observed values and fitted values.

n_features_in_ int or None

Number of features (time series) seen during fit(). For ARAR, this is always 1 as it handles univariate time series (present for scikit-learn compatibility). Available after calling fit().

is_memory_reduced bool

Flag indicating whether reduce_memory() has been called to clear diagnostic arrays (y_train_, fitted_values_, in_sample_residuals_).

is_fitted bool

Flag indicating whether the model has been successfully fitted to data.

estimator_name_ str

String identifier of the fitted model configuration (e.g., "Arar(lags=(1, 2, 3))"). This is updated after fitting to reflect the selected model.

Notes

When exogenous variables are provided during fitting, the model uses a two-step approach (regression followed by ARAR on residuals). In this approach, the target series is first regressed on the exogenous variables using a linear regression model. The residuals from this regression, representing the portion of the series not explained by the exogenous variables, are then modeled using the ARAR model.

This design allows the influence of exogenous variables to be incorporated prior to applying the ARAR model, rather than within the ARAR dynamics themselves.

This two-step approach is necessary because the ARAR model is inherently univariate and does not natively support exogenous variables. By separating the regression step, the method preserves the original ARAR formulation while still capturing the effects of external predictors.

However, this approach carries important assumptions and implications:

  • The relationship between the target series and the exogenous variables is assumed to be linear and time-invariant.
  • The ARAR model is applied only to the residual process, meaning its parameters describe the dynamics of the series after removing the contribution of exogenous variables.
  • As a result, the interpretability of the ARAR parameters changes: they no longer describe the full data-generating process, but rather the behavior of the unexplained component.

Despite these limitations, this strategy provides a practical and computationally efficient way to incorporate exogenous information into an otherwise univariate ARAR framework.
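
A conceptual sketch of the two-step decomposition described above, using plain NumPy and illustrative names (y_train, exog_train, X_future are assumed arrays; internally the class performs these steps with its own regression helper and the ARAR routine):

import numpy as np

# Step 1: linear regression of the target on the exogenous variables (with intercept)
X = np.column_stack([np.ones(len(y_train)), exog_train])
coef, *_ = np.linalg.lstsq(X, y_train, rcond=None)
residuals = y_train - X @ coef   # the component the exogenous variables cannot explain

# Step 2: fit ARAR on `residuals`; at forecast time the regression part
# (X_future @ coef) is added back to the ARAR forecast of the residual process.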

Methods:

Name Description
fit

Fit the ARAR model to a univariate time series.

predict

Generate mean forecasts steps ahead.

predict_interval

Forecast with symmetric normal-theory prediction intervals.

get_residuals

Get in-sample residuals (observed - fitted) from the ARAR model.

get_fitted_values

Get in-sample fitted values from the ARAR model.

get_score

R^2 using in-sample fitted values (ignores initial NaNs).

get_params

Get parameters for this estimator.

get_feature_importances

Get feature importances for Arar model.

get_info_criteria

Get information criteria.

set_params

Set the parameters of this estimator and reset the fitted state.

summary

Print a simple textual summary of the fitted Arar model.

reduce_memory

Reduce memory usage by removing internal arrays not needed for prediction.

Source code in skforecast\stats\_arar.py
def __init__(
    self, 
    max_ar_depth: int | None = None, 
    max_lag: int | None = None, 
    safe: bool = True
):
    self.max_ar_depth           = max_ar_depth
    self.max_lag                = max_lag
    self.safe                   = safe
    self.lags_                  = None
    self.sigma2_                = None
    self.psi_                   = None
    self.sbar_                  = None

    self.model_                 = None
    self.coef_                  = None
    self.aic_                   = None
    self.bic_                   = None
    self.exog_model_            = None
    self.coef_exog_             = None
    self.n_exog_features_in_    = None
    self.y_train_               = None
    self.fitted_values_         = None
    self.in_sample_residuals_   = None
    self.n_features_in_         = None
    self.is_memory_reduced      = False
    self.is_fitted              = False
    self.estimator_name_        = "Arar()"

max_ar_depth instance-attribute

max_ar_depth = max_ar_depth

max_lag instance-attribute

max_lag = max_lag

safe instance-attribute

safe = safe

lags_ instance-attribute

lags_ = None

sigma2_ instance-attribute

sigma2_ = None

psi_ instance-attribute

psi_ = None

sbar_ instance-attribute

sbar_ = None

model_ instance-attribute

model_ = None

coef_ instance-attribute

coef_ = None

aic_ instance-attribute

aic_ = None

bic_ instance-attribute

bic_ = None

exog_model_ instance-attribute

exog_model_ = None

coef_exog_ instance-attribute

coef_exog_ = None

n_exog_features_in_ instance-attribute

n_exog_features_in_ = None

y_train_ instance-attribute

y_train_ = None

fitted_values_ instance-attribute

fitted_values_ = None

in_sample_residuals_ instance-attribute

in_sample_residuals_ = None

n_features_in_ instance-attribute

n_features_in_ = None

is_memory_reduced instance-attribute

is_memory_reduced = False

is_fitted instance-attribute

is_fitted = False

estimator_name_ instance-attribute

estimator_name_ = 'Arar()'

fit

fit(y, exog=None, suppress_warnings=False)

Fit the ARAR model to a univariate time series.

Parameters:

Name Type Description Default
y array-like of shape (n_samples,)

Time-ordered numeric sequence.

required
exog Series, DataFrame, or ndarray of shape (n_samples, n_exog_features)

Exogenous variables to include in the model. See Notes section for details on how exogenous variables are handled.

None
suppress_warnings bool

If True, suppresses the warning about exogenous variables affecting model interpretation.

False

Returns:

Name Type Description
self Arar

Fitted estimator.

Notes

When exogenous variables are provided during fitting, the model uses a two-step approach (regression followed by ARAR on residuals). In this approach, the target series is first regressed on the exogenous variables using a linear regression model. The residuals from this regression, representing the portion of the series not explained by the exogenous variables, are then modeled using the ARAR model.

This design allows the influence of exogenous variables to be incorporated prior to applying the ARAR model, rather than within the ARAR dynamics themselves.

This two-step approach is necessary because the ARAR model is inherently univariate and does not natively support exogenous variables. By separating the regression step, the method preserves the original ARAR formulation while still capturing the effects of external predictors.

However, this approach carries important assumptions and implications:

  • The relationship between the target series and the exogenous variables is assumed to be linear and time-invariant.
  • The ARAR model is applied only to the residual process, meaning its parameters describe the dynamics of the series after removing the contribution of exogenous variables.
  • As a result, the interpretability of the ARAR parameters changes: they no longer describe the full data-generating process, but rather the behavior of the unexplained component.

Despite these limitations, this strategy provides a practical and computationally efficient way to incorporate exogenous information into an otherwise univariate ARAR framework.

Source code in skforecast\stats\_arar.py
def fit(
    self, 
    y: np.ndarray | pd.Series, 
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None,
    suppress_warnings: bool = False
) -> "Arar":
    """
    Fit the ARAR model to a univariate time series.

    Parameters
    ----------
    y : array-like of shape (n_samples,)
        Time-ordered numeric sequence.
    exog : Series, DataFrame, or ndarray of shape (n_samples, n_exog_features), default None
        Exogenous variables to include in the model. See Notes section for details
        on how exogenous variables are handled.
    suppress_warnings : bool, default False
        If True, suppresses the warning about exogenous variables affecting model
        interpretation.

    Returns
    -------
    self : Arar
        Fitted estimator.

    Notes
    -----
    When exogenous variables are provided during fitting, the model uses a
    two-step approach (regression followed by ARAR on residuals). In this
    approach, the target series is first regressed on the exogenous variables
    using a linear regression model. The residuals from this regression,
    representing the portion of the series not explained by the exogenous
    variables, are then modeled using the ARAR model.

    This design allows the influence of exogenous variables to be incorporated
    prior to applying the ARAR model, rather than within the ARAR dynamics
    themselves.

    This two-step approach is necessary because the ARAR model is inherently
    univariate and does not natively support exogenous variables. By separating
    the regression step, the method preserves the original ARAR formulation
    while still capturing the effects of external predictors.

    However, this approach carries important assumptions and implications:

    - The relationship between the target series and the exogenous variables is
    assumed to be linear and time-invariant.
    - The ARAR model is applied only to the residual process, meaning its
    parameters describe the dynamics of the series after removing the
    contribution of exogenous variables.
    - As a result, the interpretability of the ARAR parameters changes: they no
    longer describe the full data-generating process, but rather the behavior
    of the unexplained component.

    Despite these limitations, this strategy provides a practical and
    computationally efficient way to incorporate exogenous information into an
    otherwise univariate ARAR framework.

    """

    self.lags_                = None
    self.sigma2_              = None
    self.psi_                 = None
    self.sbar_                = None

    self.model_               = None
    self.coef_                = None
    self.aic_                 = None
    self.bic_                 = None
    self.exog_model_          = None
    self.coef_exog_           = None
    self.n_exog_features_in_  = None
    self.y_train_             = None
    self.fitted_values_       = None
    self.in_sample_residuals_ = None
    self.n_features_in_       = None
    self.is_memory_reduced    = False
    self.is_fitted            = False

    if not isinstance(y, (pd.Series, np.ndarray)):
        raise TypeError("`y` must be a pandas Series or numpy ndarray.")

    if not isinstance(exog, (type(None), pd.Series, pd.DataFrame, np.ndarray)):
        raise TypeError("`exog` must be None, a pandas Series, pandas DataFrame, or numpy ndarray.")

    y = np.asarray(y, dtype=float)
    if y.ndim == 2 and y.shape[1] == 1:
        y = y.ravel()
    elif y.ndim != 1:
        raise ValueError("`y` must be a 1D array-like sequence.")

    series_to_arar = y

    if exog is not None:
        if not suppress_warnings:
            warnings.warn(
                "Exogenous variables are being handled using a two-step approach: "
                "(1) linear regression on exog, (2) ARAR on residuals. "
                "This affects model interpretation:\n"
                "  - ARAR coefficients (coef_) describe residual dynamics, not the original series\n"
                "  - Pred intervals reflect only ARAR uncertainty, not exog regression uncertainty\n"
                "  - Assumes a linear, time-invariant relationship between exog and target\n"
                "For more details, see the fit() method's Notes section of ARAR class. ",
                ExogenousInterpretationWarning
            )

        exog = np.asarray(exog, dtype=float)
        if exog.ndim == 1:
            exog = exog.reshape(-1, 1)
        elif exog.ndim != 2:
            raise ValueError("`exog` must be 1D or 2D.")

        if len(exog) != len(y):
            raise ValueError(f"Length of exog ({len(exog)}) must match length of y ({len(y)})")

        self.exog_model_ = FastLinearRegression()
        self.exog_model_.fit(exog, y)
        self.coef_exog_ = self.exog_model_.coef_
        series_to_arar = y - self.exog_model_.predict(exog)

    if series_to_arar.size < 2 and not self.safe:
        raise ValueError("Series too short to fit ARAR when safe=False.")

    self.model_ = arar(
        series_to_arar, max_ar_depth=self.max_ar_depth, max_lag=self.max_lag, safe=self.safe
    )

    (Y, best_phi, best_lag, sigma2, psi, sbar, max_ar_depth, max_lag) = self.model_

    self.max_ar_depth        = max_ar_depth
    self.max_lag             = max_lag
    self.lags_               = tuple(best_lag)
    self.sigma2_             = float(sigma2)
    self.psi_                = np.asarray(psi, dtype=float)
    self.sbar_               = float(sbar)
    self.coef_               = np.asarray(best_phi, dtype=float)
    self.y_train_            = y
    self.n_exog_features_in_ = exog.shape[1] if exog is not None else 0
    self.n_features_in_      = 1       
    self.is_memory_reduced   = False
    self.is_fitted           = True

    arar_fitted = fitted_arar(self.model_)["fitted"]
    if self.exog_model_ is not None:
        exog_fitted = self.exog_model_.predict(exog)
        self.fitted_values_ = exog_fitted + arar_fitted
    else:
        self.fitted_values_ = arar_fitted

    # Residuals: original y minus fitted values
    self.in_sample_residuals_ = y - self.fitted_values_

    # Compute AIC and BIC
    # Note: For models with exogenous variables, this is an approximate calculation
    # that treats the two-step procedure (regression + ARAR) as independent stages.
    # This may underestimate model complexity. Use these criteria primarily for
    # comparing models with the same exogenous structure.
    largest_lag = max(self.lags_)
    valid_residuals = self.in_sample_residuals_[largest_lag:]
    # Remove NaN values for AIC/BIC calculation
    valid_residuals = valid_residuals[~np.isnan(valid_residuals)]
    n = len(valid_residuals)
    if n > 0:
        # Count parameters:
        # - ARAR: 4 AR coefficients + 1 mean parameter (sbar) + 1 variance (sigma2) = 6
        # - Exog: n_exog coefficients + 1 intercept (if exog present)
        # Note: We count all 4 AR coefficients even if some are zero, as they were
        # selected during model fitting. The variance parameter sigma2 is also estimated.
        k_arar = 6  # 4 AR coefficients + sbar + sigma2
        k_exog = (self.n_exog_features_in_ + 1) if self.exog_model_ is not None else 0  # +1 for intercept
        k = k_arar + k_exog
        sigma2 = max(np.sum(valid_residuals ** 2) / n, 1e-12)  # Ensure positive
        loglik = -0.5 * n * (np.log(2 * np.pi) + np.log(sigma2) + 1)
        self.aic_ = -2 * loglik + 2 * k
        self.bic_ = -2 * loglik + k * np.log(n)
    else:
        self.aic_ = np.nan
        self.bic_ = np.nan

    self.estimator_name_ = f"Arar(lags={self.lags_})"

    return self
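
The information criteria computed at the end of fit() follow a Gaussian log-likelihood approximation over the n valid residuals e_t, with k the parameter count described in the code comments above:

$$
\hat{\sigma}^2 = \frac{1}{n}\sum_t e_t^2, \qquad
\ell = -\frac{n}{2}\left(\ln 2\pi + \ln \hat{\sigma}^2 + 1\right), \qquad
\mathrm{AIC} = -2\ell + 2k, \qquad
\mathrm{BIC} = -2\ell + k \ln n
$$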

predict

predict(steps, exog=None)

Generate mean forecasts steps ahead.

Parameters:

Name Type Description Default
steps int

Forecast horizon (must be > 0)

required
exog ndarray, Series or DataFrame of shape (steps, n_exog_features)

Exogenous variables for prediction.

None

Returns:

Name Type Description
predictions ndarray of shape (h,)

Point forecasts for steps 1..h.

Source code in skforecast\stats\_arar.py
@check_is_fitted
def predict(
    self, 
    steps: int, 
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None
) -> np.ndarray:
    """
    Generate mean forecasts steps ahead.

    Parameters
    ----------
    steps : int
        Forecast horizon (must be > 0)
    exog : ndarray, Series or DataFrame of shape (steps, n_exog_features), default None
        Exogenous variables for prediction.

    Returns
    -------
    predictions : ndarray of shape (h,)
        Point forecasts for steps 1..h.

    """

    if not isinstance(steps, (int, np.integer)) or steps <= 0:
        raise ValueError("`steps` must be a positive integer.")

    # Forecast ARAR component
    predictions = forecast(self.model_, h=steps)["mean"]

    if self.exog_model_ is None and exog is not None:
        raise ValueError(
            "Model was fitted without exog, but `exog` was provided for prediction. "
            "Please refit the model with exogenous variables."
        )

    if self.exog_model_ is not None:
        if exog is None:
            raise ValueError("Model was fitted with exog, so `exog` is required for prediction.")
        exog = np.asarray(exog, dtype=float)
        if exog.ndim == 1:
            exog = exog.reshape(-1, 1)
        elif exog.ndim != 2:
            raise ValueError("`exog` must be 1D or 2D.")

        # Check feature consistency
        if exog.shape[1] != self.n_exog_features_in_:
            raise ValueError(f"Mismatch in exogenous features: fitted with {self.n_exog_features_in_}, got {exog.shape[1]}.")

        if len(exog) != steps:
            raise ValueError(f"Length of exog ({len(exog)}) must match steps ({steps}).")

        # Forecast Regression component
        exog_pred = self.exog_model_.predict(exog)
        predictions = predictions + exog_pred

    return predictions
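
A minimal usage sketch, assuming Arar is imported from the module shown in the class header (skforecast.stats._arar) and using illustrative synthetic data; when the model is fitted with exogenous variables, predict() requires an exog block with one row per forecast step:

import numpy as np
from skforecast.stats._arar import Arar  # module path from the class header above

rng = np.random.default_rng(0)
exog = rng.normal(size=(200, 2))
y = 3.0 + exog @ np.array([1.5, -0.8]) + rng.normal(scale=0.5, size=200)

model = Arar().fit(y, exog=exog, suppress_warnings=True)  # two-step fit: regression, then ARAR
future_exog = rng.normal(size=(10, 2))
forecasts = model.predict(steps=10, exog=future_exog)     # regression forecast + ARAR forecast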

predict_interval

predict_interval(
    steps=1, level=(80, 95), as_frame=True, exog=None
)

Forecast with symmetric normal-theory prediction intervals.

Parameters:

Name Type Description Default
steps int

Forecast horizon.

1
level iterable of int

Confidence levels in percent.

(80, 95)
as_frame bool

If True, return a tidy DataFrame with columns 'mean', 'lower_<L>', 'upper_<L>' for each level L. If False, return a NumPy ndarray.

True
exog ndarray, Series or DataFrame of shape (steps, n_exog_features)

Exogenous variables for prediction.

None

Returns:

Name Type Description
predictions numpy ndarray, pandas DataFrame

If as_frame=True, pandas DataFrame with columns 'mean', 'lower_<L>', 'upper_<L>' for each level L. If as_frame=False, numpy ndarray.

Notes

When exogenous variables are used, prediction intervals account only for ARAR forecast uncertainty and do not include uncertainty from the regression coefficients. This may result in undercoverage (actual coverage < nominal level).

Source code in skforecast\stats\_arar.py
@check_is_fitted
def predict_interval(
    self,
    steps: int = 1,
    level=(80, 95),
    as_frame: bool = True,
    exog: np.ndarray | pd.Series | pd.DataFrame | None = None
) -> np.ndarray | pd.DataFrame:
    """
    Forecast with symmetric normal-theory prediction intervals.

    Parameters
    ----------
    steps : int, default 1
        Forecast horizon.
    level : iterable of int, default (80, 95)
        Confidence levels in percent.
    as_frame : bool, default True
        If True, return a tidy DataFrame with columns 'mean', 'lower_<L>',
        'upper_<L>' for each level L. If False, return a NumPy ndarray.
    exog : ndarray, Series or DataFrame of shape (steps, n_exog_features), default None
        Exogenous variables for prediction.

    Returns
    -------
    predictions : numpy ndarray, pandas DataFrame
        If as_frame=True, pandas DataFrame with columns 'mean', 'lower_<L>',
        'upper_<L>' for each level L. If as_frame=False, numpy ndarray.

    Notes
    -----
    When exogenous variables are used, prediction intervals account only for 
    ARAR forecast uncertainty and do not include uncertainty from the regression 
    coefficients. This may result in **undercoverage** (actual coverage < nominal level).

    """

    if not isinstance(steps, (int, np.integer)) or steps <= 0:
        raise ValueError("`steps` must be a positive integer.")

    raw_preds = forecast(self.model_, h=steps, level=level)

    if self.exog_model_ is None and exog is not None:
        raise ValueError(
            "Model was fitted without exog, but `exog` was provided for prediction. "
            "Please refit the model with exogenous variables."
        )

    if self.exog_model_ is not None:
        if exog is None:
            raise ValueError("Model was fitted with exog, so `exog` is required for prediction.")
        exog = np.asarray(exog, dtype=float)
        if exog.ndim == 1:
            exog = exog.reshape(-1, 1)
        elif exog.ndim != 2:
            raise ValueError("`exog` must be 1D or 2D.")

        # Check feature consistency
        if exog.shape[1] != self.n_exog_features_in_:
            raise ValueError(
                f"Mismatch in exogenous features: fitted with {self.n_exog_features_in_}, "
                f"got {exog.shape[1]}.")

        if len(exog) != steps:
            raise ValueError(f"Length of exog ({len(exog)}) must match steps ({steps}).")

        exog_pred = self.exog_model_.predict(exog)

        raw_preds["mean"] = raw_preds["mean"] + exog_pred
        # Broadcast the exog prediction across confidence columns
        raw_preds["upper"] = raw_preds["upper"] + exog_pred[:, np.newaxis]
        raw_preds["lower"] = raw_preds["lower"] + exog_pred[:, np.newaxis]

    levels = raw_preds["level"]
    n_levels = len(levels)
    cols = [raw_preds["mean"]]
    for i in range(n_levels):
        cols.append(raw_preds["lower"][:, i])
        cols.append(raw_preds["upper"][:, i])

    predictions = np.column_stack(cols)

    if as_frame:
        col_names = ["mean"]
        for level in levels:
            level = int(level)
            col_names.append(f"lower_{level}")
            col_names.append(f"upper_{level}")

        predictions = pd.DataFrame(
            predictions, columns=col_names, index=pd.RangeIndex(1, steps + 1, name="step")
        )

    return predictions

get_residuals

get_residuals()

Get in-sample residuals (observed - fitted) from the ARAR model.

Returns:

Name Type Description
residuals ndarray of shape (n_samples,)
Source code in skforecast\stats\_arar.py
@check_is_fitted
def get_residuals(self) -> np.ndarray:
    """
    Get in-sample residuals (observed - fitted) from the ARAR model.

    Returns
    -------
    residuals : ndarray of shape (n_samples,)

    """

    check_memory_reduced(self, method_name='get_residuals')
    return self.in_sample_residuals_

get_fitted_values

get_fitted_values()

Get in-sample fitted values from the ARAR model.

Returns:

Name Type Description
fitted ndarray of shape (n_samples,)
Source code in skforecast\stats\_arar.py
@check_is_fitted
def get_fitted_values(self) -> np.ndarray:
    """
    Get in-sample fitted values from the ARAR model.

    Returns
    -------
    fitted : ndarray of shape (n_samples,)

    """

    check_memory_reduced(self, method_name='get_fitted_values')
    return self.fitted_values_

get_score

get_score(y=None)

R^2 using in-sample fitted values (ignores initial NaNs).

Parameters:

Name Type Description Default
y ignored

Present for API compatibility.

None

Returns:

Name Type Description
score float

Coefficient of determination.

Source code in skforecast\stats\_arar.py
@check_is_fitted
def get_score(self, y: Any = None) -> float:
    """
    R^2 using in-sample fitted values (ignores initial NaNs).

    Parameters
    ----------
    y : ignored
        Present for API compatibility.

    Returns
    -------
    score : float
        Coefficient of determination.

    """

    check_memory_reduced(self, method_name='get_score')

    y = self.y_train_
    fitted = self.fitted_values_

    mask = ~np.isnan(fitted)
    if mask.sum() < 2:
        return float("nan")
    ss_res = np.sum((y[mask] - fitted[mask]) ** 2)
    ss_tot = np.sum((y[mask] - y[mask].mean()) ** 2) + np.finfo(float).eps

    return 1.0 - ss_res / ss_tot

get_params

get_params(deep=True)

Get parameters for this estimator.

Parameters:

Name Type Description Default
deep bool

If True, will return the parameters for this estimator and contained subobjects that are estimators.

True

Returns:

Name Type Description
params dict

Parameter names mapped to their values.

Source code in skforecast\stats\_arar.py
def get_params(self, deep: bool = True) -> dict:
    """
    Get parameters for this estimator.

    Parameters
    ----------
    deep : bool, default True
        If True, will return the parameters for this estimator and
        contained subobjects that are estimators.

    Returns
    -------
    params : dict
        Parameter names mapped to their values.

    """

    return {
        "max_ar_depth": self.max_ar_depth,
        "max_lag": self.max_lag,
        "safe": self.safe
    }

get_feature_importances

get_feature_importances()

Get feature importances for Arar model.

Source code in skforecast\stats\_arar.py
@check_is_fitted
def get_feature_importances(self) -> pd.DataFrame:
    """Get feature importances for Arar model."""
    importances = pd.DataFrame({
        'feature': [f'lag_{lag}' for lag in self.lags_],
        'importance': self.coef_
    })

    if self.coef_exog_ is not None:
        exog_importances = pd.DataFrame({
            'feature': [f'exog_{i}' for i in range(self.coef_exog_.shape[0])],
            'importance': self.coef_exog_
        })
        importances = pd.concat([importances, exog_importances], ignore_index=True)
        warnings.warn(
                "Exogenous variables are being handled using a two-step approach: "
                "(1) linear regression on exog, (2) ARAR on residuals. "
                "This affects model interpretation:\n"
                "  - ARAR coefficients (coef_) describe residual dynamics, not the original series\n"
                "  - Exogenous coefficients (coef_exog_) describe exogenous impact on original series",
            ExogenousInterpretationWarning
        )

    return importances

get_info_criteria

get_info_criteria(criteria)

Get information criteria.

Parameters:

Name Type Description Default
criteria str

Information criterion to retrieve. Valid options are 'aic' and 'bic'.

required

Returns:

Name Type Description
info_criteria float

Value of the requested information criterion.

Source code in skforecast\stats\_arar.py
@check_is_fitted
def get_info_criteria(self, criteria: str) -> float:
    """
    Get information criteria.

    Parameters
    ----------
    criteria : str
        Information criterion to retrieve. Valid options are 'aic' and 'bic'.

    Returns
    -------
    info_criteria : float
        Value of the requested information criterion.

    """
    if criteria not in {'aic', 'bic'}:
        raise ValueError(
            "Invalid value for `criteria`. Valid options are 'aic' and 'bic' "
            "for ARAR model."
        )

    if criteria == 'aic':
        value = self.aic_
    else:
        value = self.bic_

    return value

set_params

set_params(**params)

Set the parameters of this estimator and reset the fitted state.

This method resets the estimator to its unfitted state whenever parameters are changed, requiring the model to be refitted before making predictions.

Parameters:

Name Type Description Default
**params dict

Estimator parameters. Valid parameter keys are 'max_ar_depth', 'max_lag', and 'safe'.

{}

Returns:

Type Description
Arar

The estimator with updated parameters and reset state.

Source code in skforecast\stats\_arar.py
def set_params(self, **params) -> "Arar":
    """
    Set the parameters of this estimator and reset the fitted state.

    This method resets the estimator to its unfitted state whenever parameters
    are changed, requiring the model to be refitted before making predictions.

    Parameters
    ----------
    **params : dict
        Estimator parameters. Valid parameter keys are 'max_ar_depth', 'max_lag',
        and 'safe'.

    Returns
    -------
    Arar
        The estimator with updated parameters and reset state.

    """

    valid_params = {'max_ar_depth', 'max_lag', 'safe'}
    for key in params.keys():
        if key not in valid_params:
            raise ValueError(
                f"Invalid parameter '{key}' for estimator {self.__class__.__name__}. "
                f"Valid parameters are: {valid_params}"
            )

    for key, value in params.items():
        setattr(self, key, value)

    # Reset fitted state
    self.lags_                  = None
    self.sigma2_                = None
    self.psi_                   = None
    self.sbar_                  = None

    self.model_                 = None
    self.coef_                  = None
    self.aic_                   = None
    self.bic_                   = None
    self.exog_model_            = None
    self.coef_exog_             = None
    self.n_exog_features_in_    = None
    self.y_train_               = None
    self.fitted_values_         = None
    self.in_sample_residuals_   = None
    self.n_features_in_         = None
    self.is_memory_reduced      = False
    self.is_fitted              = False
    self.estimator_name_        = "Arar()"

    return self
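
The sketch below illustrates the reset behaviour; the specific value passed to max_lag is arbitrary and only for illustration.

# Updating a valid parameter clears the fitted state ...
model.set_params(max_lag=26)
print(model.is_fitted)        # False: the model must be refitted

# ... so refit before forecasting again.
model.fit(y)

# Keys outside {'max_ar_depth', 'max_lag', 'safe'} are rejected:
# model.set_params(order=(1, 0, 0))  # -> ValueError: Invalid parameter 'order' ...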

summary

summary()

Print a simple textual summary of the fitted Arar model.

Source code in skforecast\stats\_arar.py
@check_is_fitted
def summary(self) -> None:
    """
    Print a simple textual summary of the fitted Arar model.
    """

    print(f"{self.estimator_name_} Model Summary")
    print("------------------")
    print(f"Selected AR lags:                         {self.lags_}")
    print(f"AR coefficients (phi):                    {np.round(self.coef_, 4)}")
    print(f"Residual variance (sigma^2):              {self.sigma2_:.4f}")
    print(f"Mean of shortened series (sbar):          {self.sbar_:.4f}")
    print(f"Length of memory-shortening filter (psi): {len(self.psi_)}")

    if not self.is_memory_reduced:
        print("\nTime Series Summary Statistics")
        print(f"Number of observations: {len(self.y_train_)}")
        print(f"Mean:                   {np.mean(self.y_train_):.4f}")
        print(f"Std Dev:                {np.std(self.y_train_, ddof=1):.4f}")
        print(f"Min:                    {np.min(self.y_train_):.4f}")
        print(f"25%:                    {np.percentile(self.y_train_, 25):.4f}")
        print(f"Median:                 {np.median(self.y_train_):.4f}")
        print(f"75%:                    {np.percentile(self.y_train_, 75):.4f}")
        print(f"Max:                    {np.max(self.y_train_):.4f}")

    print("\nModel Diagnostics")
    print(f"AIC: {self.aic_:.4f}")
    print(f"BIC: {self.bic_:.4f}")

    if self.exog_model_ is not None:
        print("\nExogenous Model (Linear Regression)")
        print("-----------------------------------")
        print(f"Number of features: {self.n_exog_features_in_}")
        print(f"Intercept: {self.exog_model_.intercept_:.4f}")
        print(f"Coefficients: {np.round(self.exog_model_.coef_, 4)}")

reduce_memory

reduce_memory()

Reduce memory usage by removing internal arrays not needed for prediction. This method clears memory-heavy arrays that are only needed for diagnostics but not for prediction. After calling this method, the following methods will raise an error:

  • fitted_(): In-sample fitted values
  • residuals_(): In-sample residuals
  • score(): R² coefficient
  • summary(): Model summary statistics

Prediction methods remain fully functional:

  • predict(): Point forecasts
  • predict_interval(): Prediction intervals

Returns:

Name Type Description
self Arar

The estimator with reduced memory usage.

Source code in skforecast\stats\_arar.py
@check_is_fitted
def reduce_memory(self) -> "Arar":
    """
    Reduce memory usage by removing internal arrays not needed for prediction.
    This method clears memory-heavy arrays that are only needed for diagnostics
    but not for prediction. After calling this method, the following methods
    will raise an error:

    - fitted_(): In-sample fitted values
    - residuals_(): In-sample residuals
    - score(): R² coefficient
    - summary(): Model summary statistics

    Prediction methods remain fully functional:

    - predict(): Point forecasts
    - predict_interval(): Prediction intervals

    Returns
    -------
    self : Arar
        The estimator with reduced memory usage.

    """

    self.fitted_values_ = None
    self.in_sample_residuals_ = None

    self.is_memory_reduced = True

    return self
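
A hedged sketch of the intended workflow: trim the diagnostic arrays once only forecasting is needed. The steps keyword is an assumption based on the predict(steps) convention described for this wrapper family.

model.reduce_memory()

# Forecasting still works after the reduction ...
predictions = model.predict(steps=12)

# ... but diagnostics that rely on the cleared arrays do not:
# model.residuals_()  # raises, per the notes above
# model.fitted_()     # raises, per the notes above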