Sarimax

Sarimax(order=(1, 0, 0), seasonal_order=(0, 0, 0, 0), trend=None, measurement_error=False, time_varying_regression=False, mle_regression=True, simple_differencing=False, enforce_stationarity=True, enforce_invertibility=True, hamilton_representation=False, concentrate_scale=False, trend_offset=1, use_exact_diffuse=False, dates=None, freq=None, missing='none', validate_specification=True, method='lbfgs', maxiter=50, start_params=None, disp=False, sm_init_kwargs={}, sm_fit_kwargs={}, sm_predict_kwargs={})

Bases: BaseEstimator, RegressorMixin

A universal scikit-learn style wrapper for statsmodels SARIMAX.

This class wraps the statsmodels.tsa.statespace.sarimax.SARIMAX model to follow the scikit-learn style. The following docstring is based on the statsmodels documentation, and it is highly recommended to visit their site for the full level of detail.

https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html
https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAXResults.html

Parameters:

Name Type Description Default
order tuple

The (p,d,q) order of the model for the number of AR parameters, differences, and MA parameters.

  • d must be an integer indicating the integration order of the process.
  • p and q may either be integers indicating the AR and MA orders (so that all lags up to those orders are included) or else iterables giving specific AR and / or MA lags to include.
`(1, 0, 0)`
seasonal_order tuple

The (P,D,Q,s) order of the seasonal component of the model for the AR parameters, differences, MA parameters, and periodicity.

  • D must be an integer indicating the integration order of the process.
  • P and Q may either be integers indicating the AR and MA orders (so that all lags up to those orders are included) or else iterables giving specific AR and / or MA lags to include.
  • s is an integer giving the periodicity (number of periods in a season), often 4 for quarterly data or 12 for monthly data.
`(0, 0, 0, 0)`
trend str

Parameter controlling the deterministic trend polynomial A(t).

  • 'c' indicates a constant (i.e. a degree zero component of the trend polynomial).
  • 't' indicates a linear trend with time.
  • 'ct' indicates both, 'c' and 't'.
  • Can also be specified as an iterable defining the non-zero polynomial exponents to include, in increasing order. For example, [1,1,0,1] denotes a + b*t + c*t^3.
`None`
measurement_error bool

Whether or not to assume the endogenous observations y were measured with error.

`False`
time_varying_regression bool

Used when explanatory variables, exog, are provided, to select whether the coefficients on the exogenous regressors are allowed to vary over time.

`False`
mle_regression bool

Whether to estimate the regression coefficients for the exogenous variables as part of maximum likelihood estimation (True) or through the Kalman filter, i.e. recursive least squares (False). If time_varying_regression is True, this must be set to False.

`True`
simple_differencing bool

Whether or not to use partially conditional maximum likelihood estimation.

  • If True, differencing is performed prior to estimation, which discards the first s*D + d initial rows but results in a smaller state-space formulation.
  • If False, the full SARIMAX model is put in state-space form so that all datapoints can be used in estimation.
`False`
enforce_stationarity bool

Whether or not to transform the AR parameters to enforce stationarity in the autoregressive component of the model.

`True`
enforce_invertibility bool

Whether or not to transform the MA parameters to enforce invertibility in the moving average component of the model.

`True`
hamilton_representation bool

Whether or not to use the Hamilton representation of an ARMA process (if True) or the Harvey representation (if False).

`False`
concentrate_scale bool

Whether or not to concentrate the scale (variance of the error term) out of the likelihood. This reduces the number of parameters estimated by maximum likelihood by one, but standard errors will then not be available for the scale parameter.

`False`
trend_offset int

The offset at which to start time trend values. Default is 1, so that if trend='t' the trend is equal to 1, 2, ..., nobs. Typically only set when the model is created by extending a previous dataset.

`1`
use_exact_diffuse bool

Whether or not to use exact diffuse initialization for non-stationary states. Default is False (in which case approximate diffuse initialization is used).

`False`
method str

The method determines which solver from scipy.optimize is used, and it can be chosen from among the following strings:

  • 'newton' for Newton-Raphson
  • 'nm' for Nelder-Mead
  • 'bfgs' for Broyden-Fletcher-Goldfarb-Shanno (BFGS)
  • 'lbfgs' for limited-memory BFGS with optional box constraints
  • 'powell' for modified Powell's method
  • 'cg' for conjugate gradient
  • 'ncg' for Newton-conjugate gradient
  • 'basinhopping' for global basin-hopping solver
`'lbfgs'`
maxiter int

The maximum number of iterations to perform.

`50`
start_params numpy ndarray

Initial guess of the solution for the loglikelihood maximization. If None, the default is given by regressor.start_params.

`None`
disp bool

Set to True to print convergence messages.

`False`
sm_init_kwargs dict

Additional keyword arguments to pass to the statsmodels SARIMAX model when it is initialized.

`{}`
sm_fit_kwargs dict

Additional keyword arguments to pass to the fit method of the statsmodels SARIMAX model. The statsmodels SARIMAX.fit parameters method, maxiter, start_params and disp have been moved to the initialization of this model and take priority over any values supplied via sm_fit_kwargs.

`{}`
sm_predict_kwargs dict

Additional keyword arguments to pass to the get_forecast method of the statsmodels SARIMAXResults object.

`{}`
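
To make the iterable form of trend concrete, the following is a minimal sketch in plain Python of how a specification such as [1, 1, 0, 1] selects polynomial terms, and how trend_offset shifts the time index. The helper `trend_values` and the coefficient values are hypothetical illustrations; they are not part of skforecast or statsmodels, which estimate the trend coefficients during fitting.

```python
def trend_values(spec, coefs, nobs, trend_offset=1):
    """Evaluate a deterministic trend polynomial A(t).

    spec  : iterable of 0/1 flags; spec[k] == 1 includes the t**k term.
    coefs : one coefficient per included term, in increasing order of power.
    The time index runs over trend_offset, ..., trend_offset + nobs - 1.
    """
    powers = [k for k, flag in enumerate(spec) if flag]
    if len(powers) != len(coefs):
        raise ValueError("need exactly one coefficient per included power")
    times = range(trend_offset, trend_offset + nobs)
    return [sum(c * t**k for c, k in zip(coefs, powers)) for t in times]


# [1, 1, 0, 1] keeps powers 0, 1 and 3, i.e. a + b*t + c*t**3
# (the coefficients a=2.0, b=0.5, c=0.1 are made up for illustration).
values = trend_values([1, 1, 0, 1], coefs=[2.0, 0.5, 0.1], nobs=3)

# [0, 1] with a unit coefficient is a pure linear time trend.
linear = trend_values([0, 1], coefs=[1.0], nobs=4)
```

With spec [0, 1] and a unit coefficient the sketch yields 1, 2, ..., nobs, which is the trend='t' behaviour described for trend_offset.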

Attributes:

Name Type Description
order tuple

The (p,d,q) order of the model for the number of AR parameters, differences, and MA parameters.

seasonal_order tuple

The (P,D,Q,s) order of the seasonal component of the model for the AR parameters, differences, MA parameters, and periodicity.

trend str

Deterministic trend polynomial A(t).

measurement_error bool

Whether or not to assume the endogenous observations y were measured with error.

time_varying_regression bool

Used when explanatory variables, exog, are provided, to select whether the coefficients on the exogenous regressors are allowed to vary over time.

mle_regression bool

Whether to estimate the regression coefficients for the exogenous variables as part of maximum likelihood estimation (True) or through the Kalman filter, i.e. recursive least squares (False). If time_varying_regression is True, this must be set to False.

simple_differencing bool

Whether or not to use partially conditional maximum likelihood estimation.

enforce_stationarity bool

Whether or not to transform the AR parameters to enforce stationarity in the autoregressive component of the model.

enforce_invertibility bool

Whether or not to transform the MA parameters to enforce invertibility in the moving average component of the model.

hamilton_representation bool

Whether or not to use the Hamilton representation of an ARMA process (if True) or the Harvey representation (if False).

concentrate_scale bool

Whether or not to concentrate the scale (variance of the error term) out of the likelihood. This reduces the number of parameters estimated by maximum likelihood by one, but standard errors will then not be available for the scale parameter.

trend_offset int

The offset at which to start time trend values.

use_exact_diffuse bool

Whether or not to use exact diffuse initialization for non-stationary states.

method str

The method determines which solver from scipy.optimize is used.

maxiter int

The maximum number of iterations to perform.

start_params numpy ndarray

Initial guess of the solution for the loglikelihood maximization.

disp bool

Set to True to print convergence messages.

sm_init_kwargs dict

Additional keyword arguments to pass to the statsmodels SARIMAX model when it is initialized.

sm_fit_kwargs dict

Additional keyword arguments to pass to the fit method of the statsmodels SARIMAX model.

sm_predict_kwargs dict

Additional keyword arguments to pass to the get_forecast method of the statsmodels SARIMAXResults object.

_sarimax_params dict

Parameters of this model that can be set with the set_params method.

output_type str

Format of the object returned by the predict method. This is set automatically according to the type of y used in the fit method to train the model, 'numpy' or 'pandas'.

sarimax object

The statsmodels.tsa.statespace.sarimax.SARIMAX object created.

fitted bool

Tag to identify if the regressor has been fitted (trained).

sarimax_res object

The resulting statsmodels.tsa.statespace.sarimax.SARIMAXResults object created by statsmodels after fitting the SARIMAX model.

training_index pandas Index

Index of the training series, set only when the series is a pandas Series or DataFrame.
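
The simple_differencing parameter states that differencing before estimation discards the first s*D + d rows. The sketch below checks that count on a toy series using plain Python lists. The helpers `difference` and `simple_difference` are hypothetical and only mimic the row loss; statsmodels performs the actual differencing internally.

```python
def difference(series, lag=1):
    """Return series[i] - series[i - lag]; the result is `lag` rows shorter."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def simple_difference(series, d=0, D=0, s=0):
    """Apply D seasonal differences of period s, then d ordinary differences."""
    out = list(series)
    for _ in range(D):
        out = difference(out, lag=s)
    for _ in range(d):
        out = difference(out, lag=1)
    return out


y = list(range(48))                      # four "years" of monthly toy data
diffed = simple_difference(y, d=1, D=1, s=12)
lost = len(y) - len(diffed)              # equals s*D + d = 12*1 + 1
```

Because every differencing pass shortens the series by its lag, the total number of discarded rows is the sum of the lags, which is where the s*D + d figure comes from.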

Source code in skforecast\Sarimax\Sarimax.py
def __init__(
    self,
    order: tuple = (1, 0, 0),
    seasonal_order: tuple = (0, 0, 0, 0),
    trend: str = None,
    measurement_error: bool = False,
    time_varying_regression: bool = False,
    mle_regression: bool = True,
    simple_differencing: bool = False,
    enforce_stationarity: bool = True,
    enforce_invertibility: bool = True,
    hamilton_representation: bool = False,
    concentrate_scale: bool = False,
    trend_offset: int = 1,
    use_exact_diffuse: bool = False,
    dates = None,
    freq = None,
    missing = 'none',
    validate_specification: bool = True,
    method: str = 'lbfgs',
    maxiter: int = 50,
    start_params: np.ndarray = None,
    disp: bool = False,
    sm_init_kwargs: dict = {},
    sm_fit_kwargs: dict = {},
    sm_predict_kwargs: dict = {}
) -> None:

    self.order                   = order
    self.seasonal_order          = seasonal_order
    self.trend                   = trend
    self.measurement_error       = measurement_error
    self.time_varying_regression = time_varying_regression
    self.mle_regression          = mle_regression
    self.simple_differencing     = simple_differencing
    self.enforce_stationarity    = enforce_stationarity
    self.enforce_invertibility   = enforce_invertibility
    self.hamilton_representation = hamilton_representation
    self.concentrate_scale       = concentrate_scale
    self.trend_offset            = trend_offset
    self.use_exact_diffuse       = use_exact_diffuse
    self.dates                   = dates
    self.freq                    = freq
    self.missing                 = missing
    self.validate_specification  = validate_specification
    self.method                  = method
    self.maxiter                 = maxiter
    self.start_params            = start_params
    self.disp                    = disp

    # Create the dictionaries with the additional statsmodels parameters to be
    # used during the init, fit and predict methods. Note that the statsmodels
    # SARIMAX.fit parameters `method`, `maxiter`, `start_params` and `disp`
    # have been moved to the initialization of this model and take priority
    # over those provided by the user via `sm_fit_kwargs`.
    self.sm_init_kwargs    = sm_init_kwargs
    self.sm_fit_kwargs     = sm_fit_kwargs
    self.sm_predict_kwargs = sm_predict_kwargs

    # Params that can be set with the `set_params` method
    _, _, _, _sarimax_params = inspect.getargvalues(inspect.currentframe())
    _sarimax_params.pop("self")
    self._sarimax_params = _sarimax_params

    self._consolidate_kwargs()

    # Create Results Attributes 
    self.output_type    = None
    self.sarimax        = None
    self.fitted         = False
    self.sarimax_res    = None
    self.training_index = None

_consolidate_kwargs()

Create the dictionaries to be used during the init, fit, and predict methods. Note that the parameters in this model's initialization take precedence over those provided by the user via the statsmodels kwargs dicts.

Parameters:

Name Type Description Default
self
required

Returns:

Type Description
None
Source code in skforecast\Sarimax\Sarimax.py
def _consolidate_kwargs(
    self
) -> None:
    """
    Create the dictionaries to be used during the init, fit, and predict methods.
    Note that the parameters in this model's initialization take precedence 
    over those provided by the user via the statsmodels kwargs dicts.

    Parameters
    ----------
    self

    Returns
    -------
    None

    """

    # statsmodels.tsa.statespace.SARIMAX parameters
    _init_kwargs = self.sm_init_kwargs.copy()
    _init_kwargs.update({
       'order': self.order,
       'seasonal_order': self.seasonal_order,
       'trend': self.trend,
       'measurement_error': self.measurement_error,
       'time_varying_regression': self.time_varying_regression,
       'mle_regression': self.mle_regression,
       'simple_differencing': self.simple_differencing,
       'enforce_stationarity': self.enforce_stationarity,
       'enforce_invertibility': self.enforce_invertibility,
       'hamilton_representation': self.hamilton_representation,
       'concentrate_scale': self.concentrate_scale,
       'trend_offset': self.trend_offset,
       'use_exact_diffuse': self.use_exact_diffuse,
       'dates': self.dates,
       'freq': self.freq,
       'missing': self.missing,
       'validate_specification': self.validate_specification
    })
    self._init_kwargs = _init_kwargs

    # statsmodels.tsa.statespace.SARIMAX.fit parameters
    _fit_kwargs = self.sm_fit_kwargs.copy()
    _fit_kwargs.update({
       'method': self.method,
       'maxiter': self.maxiter,
       'start_params': self.start_params,
       'disp': self.disp,
    })        
    self._fit_kwargs = _fit_kwargs

    # statsmodels.tsa.statespace.SARIMAXResults.get_forecast parameters
    self._predict_kwargs = self.sm_predict_kwargs.copy()
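
The precedence rule implemented above (copy the user's dict, then update it with the explicit constructor arguments) can be illustrated in isolation. This is a minimal sketch with plain dicts; the function name `consolidate` and the sample values are assumptions made for illustration, not skforecast API.

```python
def consolidate(user_kwargs, explicit):
    """Merge two kwargs dicts so that `explicit` wins on key clashes."""
    merged = user_kwargs.copy()   # copy, so the caller's dict is not mutated
    merged.update(explicit)       # explicit constructor arguments overwrite
    return merged


# The user tries to set `maxiter` through the extra fit kwargs...
sm_fit_kwargs = {"maxiter": 500, "cov_type": "robust"}
# ...but the values given directly to the constructor take priority.
explicit = {"method": "lbfgs", "maxiter": 50, "start_params": None, "disp": False}

fit_kwargs = consolidate(sm_fit_kwargs, explicit)
```

Keys that appear only in the user's dict, such as cov_type here, pass through unchanged, while clashing keys such as maxiter keep the constructor value. To change maxiter, set it on the model itself rather than in sm_fit_kwargs.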

_create_sarimax(endog, exog=None)

A helper method to create a new statsmodels SARIMAX model.

Additional keyword arguments to pass to the statsmodels SARIMAX model when it is initialized can be added with the sm_init_kwargs argument when initializing the model.

Parameters:

Name Type Description Default
endog numpy ndarray, pandas Series, pandas DataFrame

The endogenous variable.

required
exog numpy ndarray, pandas Series, pandas DataFrame

The exogenous variables.

`None`

Returns:

Type Description
None
Source code in skforecast\Sarimax\Sarimax.py
def _create_sarimax(
    self,
    endog: Union[np.ndarray, pd.Series, pd.DataFrame],
    exog: Optional[Union[np.ndarray, pd.Series, pd.DataFrame]]=None
) -> None:
    """
    A helper method to create a new statsmodels SARIMAX model.

    Additional keyword arguments to pass to the statsmodels SARIMAX model 
    when it is initialized can be added with the `sm_init_kwargs` argument 
    when initializing the model.

    Parameters
    ----------
    endog : numpy ndarray, pandas Series, pandas DataFrame
        The endogenous variable.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default `None`
        The exogenous variables.

    Returns
    -------
    None

    """

    self.sarimax = SARIMAX(endog=endog, exog=exog, **self._init_kwargs)

fit(y, exog=None)

Fit the model to the data.

Additional keyword arguments to pass to the fit method of the statsmodels SARIMAX model can be added with the sm_fit_kwargs argument when initializing the model.

Parameters:

Name Type Description Default
y numpy ndarray, pandas Series, pandas DataFrame

Training time series.

required
exog numpy ndarray, pandas Series, pandas DataFrame

Exogenous variable/s included as predictor/s. Must have the same number of observations as y and their indexes must be aligned so that y[i] is regressed on exog[i].

`None`

Returns:

Type Description
None
Source code in skforecast\Sarimax\Sarimax.py
def fit(
    self,
    y: Union[np.ndarray, pd.Series, pd.DataFrame],
    exog: Optional[Union[np.ndarray, pd.Series, pd.DataFrame]]=None
) -> None:
    """
    Fit the model to the data.

    Additional keyword arguments to pass to the `fit` method of the
    statsmodels SARIMAX model can be added with the `sm_fit_kwargs` argument 
    when initializing the model.

    Parameters
    ----------
    y : numpy ndarray, pandas Series, pandas DataFrame
        Training time series.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default `None`
        Exogenous variable/s included as predictor/s. Must have the same
        number of observations as `y` and their indexes must be aligned so
        that y[i] is regressed on exog[i].

    Returns
    -------
    None

    """

    # Reset values in case the model has already been fitted.
    self.output_type    = None
    self.sarimax_res    = None
    self.fitted         = False
    self.training_index = None

    self.output_type = 'numpy' if isinstance(y, np.ndarray) else 'pandas'

    self._create_sarimax(endog=y, exog=exog)
    self.sarimax_res = self.sarimax.fit(**self._fit_kwargs)
    self.fitted = True

    if self.output_type == 'pandas':
        self.training_index = y.index

predict(steps, exog=None, return_conf_int=False, alpha=0.05)

Forecast future values and, if desired, their confidence intervals.

Generate forecasts `steps` steps into the future, optionally with confidence intervals. If exogenous variables were used to fit the model, they must also be provided to predict; otherwise the call fails.

Additional keyword arguments to pass to the get_forecast method of the statsmodels SARIMAXResults object can be added with the sm_predict_kwargs argument when initializing the model.

Parameters:

Name Type Description Default
steps int

Number of future steps predicted.

required
exog numpy ndarray, pandas Series, pandas DataFrame

Value of the exogenous variable/s for the next steps. The number of observations needed is the number of steps to predict.

`None`
return_conf_int bool

Whether to get the confidence intervals of the forecasts.

`False`
alpha float

The confidence intervals for the forecasts are 100 * (1 - alpha) %.

`0.05`

Returns:

Name Type Description
predictions numpy ndarray, pandas DataFrame

Values predicted by the forecaster and their estimated interval. The output type is the same as the type of y used in the fit method.

  • pred: predictions.
  • lower_bound: lower bound of the interval. (if return_conf_int)
  • upper_bound: upper bound of the interval. (if return_conf_int)
Source code in skforecast\Sarimax\Sarimax.py
@_check_fitted
def predict(
    self,
    steps: int,
    exog: Optional[Union[np.ndarray, pd.Series, pd.DataFrame]]=None, 
    return_conf_int: bool=False,
    alpha: float=0.05
) -> Union[np.ndarray, pd.DataFrame]:
    """
    Forecast future values and, if desired, their confidence intervals.

    Generate forecasts `steps` steps into the future, optionally with
    confidence intervals. If exogenous variables were used to fit the model,
    they must also be provided to `predict`; otherwise the call fails.

    Additional keyword arguments to pass to the `get_forecast` method of the
    statsmodels SARIMAXResults object can be added with the
    `sm_predict_kwargs` argument when initializing the model.

    Parameters
    ----------
    steps : int
        Number of future steps predicted.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default `None`
        Value of the exogenous variable/s for the next steps. The number of 
        observations needed is the number of steps to predict. 
    return_conf_int : bool, default `False`
        Whether to get the confidence intervals of the forecasts.
    alpha : float, default `0.05`
        The confidence intervals for the forecasts are 100 * (1 - alpha) %.

    Returns
    -------
    predictions : numpy ndarray, pandas DataFrame
        Values predicted by the forecaster and their estimated interval. The 
        output type is the same as the type of `y` used in the fit method.

        - pred: predictions.
        - lower_bound: lower bound of the interval. (if `return_conf_int`)
        - upper_bound: upper bound of the interval. (if `return_conf_int`)

    """

    # This is done because statsmodels doesn't allow `exog` length greater than
    # the number of steps
    if exog is not None and len(exog) > steps:
        warnings.warn(
            (f"When predicting using exogenous variables, the `exog` parameter "
             f"must have the same length as the number of predicted steps. Since "
             f"len(exog) > steps, only the first {steps} observations are used.")
        )
        exog = exog[:steps]

    predictions = self.sarimax_res.get_forecast(
                      steps = steps,
                      exog  = exog,
                      **self._predict_kwargs
                  )

    if not return_conf_int:
        predictions = predictions.predicted_mean
        if self.output_type == 'pandas':
            predictions = predictions.rename("pred").to_frame()
    else:
        if self.output_type == 'numpy':
            predictions = np.column_stack(
                              [predictions.predicted_mean,
                               predictions.conf_int(alpha=alpha)]
                          )
        else:
            predictions = pd.concat((
                              predictions.predicted_mean,
                              predictions.conf_int(alpha=alpha)),
                              axis = 1
                          )
            predictions.columns = ['pred', 'lower_bound', 'upper_bound']

    return predictions
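
To show how alpha relates to the width of the interval, here is a minimal sketch using only the Python standard library. It assumes normally distributed forecast errors and a known standard error; the helper `normal_interval` and the numbers are hypothetical, and in the wrapper the bounds come from the `conf_int(alpha=alpha)` call shown above rather than from this function.

```python
from statistics import NormalDist


def normal_interval(mean, std_error, alpha=0.05):
    """Two-sided 100 * (1 - alpha) % interval under a normal assumption."""
    # Quantile that leaves alpha / 2 probability in each tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return mean - z * std_error, mean + z * std_error


# alpha = 0.05 gives a 95 % interval; alpha = 0.20 gives a narrower 80 % one.
lower_95, upper_95 = normal_interval(mean=10.0, std_error=2.0, alpha=0.05)
lower_80, upper_80 = normal_interval(mean=10.0, std_error=2.0, alpha=0.20)
```

A smaller alpha therefore produces wider lower_bound and upper_bound columns for the same forecast.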

append(y, exog=None, refit=False, copy_initialization=False, **kwargs)

Recreate the results object with new data appended to the original data.

Creates a new result object applied to a dataset that is created by appending new data to the end of the model's original data. The new results can then be used for analysis or forecasting.

Parameters:

Name Type Description Default
y numpy ndarray, pandas Series, pandas DataFrame

New observations from the modeled time-series process.

required
exog numpy ndarray, pandas Series, pandas DataFrame

New observations of exogenous regressors, if applicable. Must have the same number of observations as y and their indexes must be aligned so that y[i] is regressed on exog[i].

`None`
refit bool

Whether to re-fit the parameters, based on the combined dataset.

`False`
copy_initialization bool

Whether or not to copy the initialization from the current results set to the new model.

`False`
**kwargs

Keyword arguments may be used to modify model specification arguments when creating the new model object.

{}

Returns:

Type Description
None
Notes

The y and exog arguments to this method must be formatted in the same way (e.g. Pandas Series versus Numpy array) as were the y and exog arrays passed to the original model.

The y argument to this method should consist of new observations that occurred directly after the last element of the model's original y. For any other kind of dataset, see the apply method.

This method will apply filtering to all of the original data as well as to the new data. To apply filtering only to the new data (which can be much faster if the original dataset is large), see the extend method.

https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.append.html#statsmodels.tsa.statespace.mlemodel.MLEResults.append

Source code in skforecast\Sarimax\Sarimax.py
@_check_fitted
def append(
    self,
    y: Union[np.ndarray, pd.Series, pd.DataFrame],
    exog: Optional[Union[np.ndarray, pd.Series, pd.DataFrame]]=None,
    refit: bool=False,
    copy_initialization: bool=False,
    **kwargs
) -> None:
    """
    Recreate the results object with new data appended to the original data.

    Creates a new result object applied to a dataset that is created by 
    appending new data to the end of the model's original data. The new 
    results can then be used for analysis or forecasting.

    Parameters
    ----------
    y : numpy ndarray, pandas Series, pandas DataFrame
        New observations from the modeled time-series process.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default `None`
        New observations of exogenous regressors, if applicable. Must have 
        the same number of observations as `y` and their indexes must be 
        aligned so that y[i] is regressed on exog[i].
    refit : bool, default `False`
        Whether to re-fit the parameters, based on the combined dataset.
    copy_initialization : bool, default `False`
        Whether or not to copy the initialization from the current results 
        set to the new model. 
    **kwargs
        Keyword arguments may be used to modify model specification arguments 
        when creating the new model object.

    Returns
    -------
    None

    Notes
    -----
    The `y` and `exog` arguments to this method must be formatted in the same 
    way (e.g. Pandas Series versus Numpy array) as were the `y` and `exog` 
    arrays passed to the original model.

    The `y` argument to this method should consist of new observations that 
    occurred directly after the last element of the model's original `y`. For
    any other kind of dataset, see the apply method.

    This method will apply filtering to all of the original data as well as 
    to the new data. To apply filtering only to the new data (which can be 
    much faster if the original dataset is large), see the extend method.

    https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.append.html#statsmodels.tsa.statespace.mlemodel.MLEResults.append

    """

    fit_kwargs = self._fit_kwargs if refit else None

    self.sarimax_res = self.sarimax_res.append(
                           endog               = y,
                           exog                = exog,
                           refit               = refit,
                           copy_initialization = copy_initialization,
                           fit_kwargs          = fit_kwargs,
                           **kwargs
                       )

apply(y, exog=None, refit=False, copy_initialization=False, **kwargs)

Apply the fitted parameters to new data unrelated to the original data.

Creates a new result object using the current fitted parameters, applied to a completely new dataset that is assumed to be unrelated to the model's original data. The new results can then be used for analysis or forecasting.

Parameters:

Name Type Description Default
y numpy ndarray, pandas Series, pandas DataFrame

New observations from the modeled time-series process.

required
exog numpy ndarray, pandas Series, pandas DataFrame

New observations of exogenous regressors, if applicable. Must have the same number of observations as y and their indexes must be aligned so that y[i] is regressed on exog[i].

`None`
refit bool

Whether to re-fit the parameters, using the new dataset.

`False`
copy_initialization bool

Whether or not to copy the initialization from the current results set to the new model.

`False`
**kwargs

Keyword arguments may be used to modify model specification arguments when creating the new model object.

{}

Returns:

Type Description
None
Notes

The y argument to this method should consist of new observations that are not necessarily related to the original model's y dataset. For observations that continue that original dataset by following directly after its last element, see the append and extend methods.

https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.apply.html#statsmodels.tsa.statespace.mlemodel.MLEResults.apply

Source code in skforecast\Sarimax\Sarimax.py
@_check_fitted
def apply(
    self,
    y: Union[np.ndarray, pd.Series, pd.DataFrame],
    exog: Optional[Union[np.ndarray, pd.Series, pd.DataFrame]]=None,
    refit: bool=False,
    copy_initialization: bool=False,
    **kwargs
) -> None:
    """
    Apply the fitted parameters to new data unrelated to the original data.

    Creates a new result object using the current fitted parameters, applied 
    to a completely new dataset that is assumed to be unrelated to the model's
    original data. The new results can then be used for analysis or forecasting.

    Parameters
    ----------
    y : numpy ndarray, pandas Series, pandas DataFrame
        New observations from the modeled time-series process.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default `None`
        New observations of exogenous regressors, if applicable. Must have 
        the same number of observations as `y` and their indexes must be 
        aligned so that y[i] is regressed on exog[i].
    refit : bool, default `False`
        Whether to re-fit the parameters, using the new dataset.
    copy_initialization : bool, default `False`
        Whether or not to copy the initialization from the current results 
        set to the new model. 
    **kwargs
        Keyword arguments may be used to modify model specification arguments 
        when creating the new model object.

    Returns
    -------
    None

    Notes
    -----
    The `y` argument to this method should consist of new observations that 
    are not necessarily related to the original model's `y` dataset. For 
    observations that continue that original dataset by following directly
    after its last element, see the append and extend methods.

    https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.apply.html#statsmodels.tsa.statespace.mlemodel.MLEResults.apply

    """

    fit_kwargs = self._fit_kwargs if refit else None

    self.sarimax_res = self.sarimax_res.apply(
                           endog               = y,
                           exog                = exog,
                           refit               = refit,
                           copy_initialization = copy_initialization,
                           fit_kwargs          = fit_kwargs,
                           **kwargs
                       )

extend(y, exog=None, **kwargs)

Recreate the results object for new data that extends the original data.

Creates a new result object applied to a new dataset that is assumed to follow directly from the end of the model's original data. The new results can then be used for analysis or forecasting.

Parameters:

Name Type Description Default
y numpy ndarray, pandas Series, pandas DataFrame

New observations from the modeled time-series process.

required
exog numpy ndarray, pandas Series, pandas DataFrame

New observations of exogenous regressors, if applicable. Must have the same number of observations as y and their indexes must be aligned so that y[i] is regressed on exog[i].

`None`
**kwargs

Keyword arguments may be used to modify model specification arguments when creating the new model object.

{}

Returns:

Type Description
None
Notes

The y argument to this method should consist of new observations that occurred directly after the last element of the model's original y array. For any other kind of dataset, see the apply method.

This method will apply filtering only to the new data provided by the y argument, which can be much faster than re-filtering the entire dataset. However, the returned results object will only have results for the new data. To retrieve results for both the new data and the original data, see the append method.

https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.extend.html#statsmodels.tsa.statespace.mlemodel.MLEResults.extend

Source code in skforecast\Sarimax\Sarimax.py
@_check_fitted
def extend(
    self,
    y: Union[np.ndarray, pd.Series, pd.DataFrame],
    exog: Optional[Union[np.ndarray, pd.Series, pd.DataFrame]]=None,
    **kwargs
) -> None:
    """
    Recreate the results object for new data that extends the original data.

    Creates a new result object applied to a new dataset that is assumed to 
    follow directly from the end of the model's original data. The new 
    results can then be used for analysis or forecasting.

    Parameters
    ----------
    y : numpy ndarray, pandas Series, pandas DataFrame
        New observations from the modeled time-series process.
    exog : numpy ndarray, pandas Series, pandas DataFrame, default `None`
        New observations of exogenous regressors, if applicable. Must have 
        the same number of observations as `y` and their indexes must be 
        aligned so that y[i] is regressed on exog[i].
    **kwargs
        Keyword arguments may be used to modify model specification arguments 
        when creating the new model object.

    Returns
    -------
    None

    Notes
    -----
    The `y` argument to this method should consist of new observations that 
    occurred directly after the last element of the model's original `y` 
    array. For any other kind of dataset, see the apply method.

    This method will apply filtering only to the new data provided by the `y` 
    argument, which can be much faster than re-filtering the entire dataset. 
    However, the returned results object will only have results for the new 
    data. To retrieve results for both the new data and the original data, 
    see the append method.

    https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.mlemodel.MLEResults.extend.html#statsmodels.tsa.statespace.mlemodel.MLEResults.extend

    """

    self.sarimax_res = self.sarimax_res.extend(
                           endog = y,
                           exog  = exog,
                           **kwargs
                       )
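
The Notes for extend say that only the new observations are filtered, starting from where the original data ended. The sketch below illustrates that idea with a toy recursive filter (simple exponential smoothing) standing in for the Kalman filter. The function `run_filter`, the smoothing factor and the data are hypothetical and are not part of skforecast or statsmodels.

```python
from typing import List, Tuple


def run_filter(
    y: List[float], init_state: float, alpha: float = 0.5
) -> Tuple[List[float], float]:
    """Toy recursive filter (exponential smoothing), a stand-in for the
    Kalman filter. Returns the filtered states and the final state."""
    state = init_state
    filtered = []
    for obs in y:
        state = alpha * obs + (1 - alpha) * state
        filtered.append(state)
    return filtered, state


y_train = [1.0, 2.0, 3.0, 4.0]  # hypothetical original data
y_new = [5.0, 6.0]              # hypothetical observations that follow it

# Filter the original data once and keep the final state.
_, last_state = run_filter(y_train, init_state=0.0)

# "extend"-style: filter only the new data, initialized from the final
# state of the original data. Results cover the new data only.
extended, _ = run_filter(y_new, init_state=last_state)

# "append"-style: re-filter the original and new data together.
# Results cover the whole series.
appended, _ = run_filter(y_train + y_new, init_state=0.0)

print(len(extended), len(appended))
print(extended == appended[-len(y_new):])
```

In this toy setting the extend-style results match the tail of the append-style results, while requiring a pass over only the new observations.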

set_params(**params)

Set new values for the parameters of the regressor.

Parameters:

Name Type Description Default
params dict

Parameter values.

{}

Returns:

Type Description
None
Source code in skforecast\Sarimax\Sarimax.py
def set_params(
    self, 
    **params: dict
) -> None:
    """
    Set new values for the parameters of the regressor.

    Parameters
    ----------
    params : dict
        Parameter values.

    Returns
    -------
    None

    """

    params = {k:v for k,v in params.items() if k in self._sarimax_params}
    for key, value in params.items():
        setattr(self, key, value)

    self._consolidate_kwargs()

    # Reset values in case the model has already been fitted.
    self.output_type    = None
    self.sarimax_res    = None
    self.fitted         = False
    self.training_index = None
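
As the source above shows, set_params keeps only the keys it recognizes and then clears the fitted state, so the model must be fitted again before predicting. A minimal sketch of that pattern follows, using a hypothetical `ToyEstimator` class rather than the real `Sarimax` class. The attribute names `_known_params` and `result` are illustrative assumptions.

```python
class ToyEstimator:
    """Hypothetical estimator mimicking the filter-and-reset pattern."""

    _known_params = ("order", "trend")  # assumed set of accepted names

    def __init__(self, order=(1, 0, 0), trend=None):
        self.order = order
        self.trend = trend
        self.result = None
        self.fitted = False

    def fit(self, y):
        # Placeholder "fit": store the mean of the data.
        self.result = sum(y) / len(y)
        self.fitted = True

    def set_params(self, **params):
        # Unknown keys are dropped silently instead of raising an error.
        params = {k: v for k, v in params.items() if k in self._known_params}
        for key, value in params.items():
            setattr(self, key, value)
        # Any previous fit is invalidated by the new configuration.
        self.result = None
        self.fitted = False


est = ToyEstimator()
est.fit([1.0, 2.0, 3.0])
est.set_params(order=(2, 1, 0), not_a_param=123)

print(est.order, est.fitted, hasattr(est, "not_a_param"))
```

One consequence of this design is that a misspelled parameter name is ignored rather than reported, so it can be worth checking the attribute values after the call.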

params()

Get the parameters of the model. The order of variables is the trend coefficients, the k_exog exogenous coefficients, the k_ar AR coefficients, and finally the k_ma MA coefficients.

Returns:

Name Type Description
params numpy ndarray, pandas Series

The parameters of the model.

Source code in skforecast\Sarimax\Sarimax.py
@_check_fitted
def params(
    self
) -> Union[np.ndarray, pd.Series]:
    """
    Get the parameters of the model. The order of variables is the trend
    coefficients, the `k_exog` exogenous coefficients, the `k_ar` AR 
    coefficients, and finally the `k_ma` MA coefficients.

    Returns
    -------
    params : numpy ndarray, pandas Series
        The parameters of the model.

    """

    return self.sarimax_res.params

summary(alpha=0.05, start=None)

Get a summary of the SARIMAXResults object.

Parameters:

Name Type Description Default
alpha float

Significance level for the confidence intervals reported in the summary. The intervals have (1 - alpha) coverage, for example 95 % for alpha = 0.05.

`0.05`
start int

Integer of the start observation.

`None`

Returns:

Name Type Description
summary Summary instance

This holds the summary table and text, which can be printed or converted to various output formats.

Source code in skforecast\Sarimax\Sarimax.py
@_check_fitted
def summary(
    self,
    alpha: float = 0.05,
    start: Optional[int] = None
) -> object:
    """
    Get a summary of the SARIMAXResults object.

    Parameters
    ----------
    alpha : float, default `0.05`
        Significance level for the confidence intervals reported in the 
        summary. The intervals have (1 - alpha) coverage, for example 95 % 
        for alpha = 0.05.
    start : int, default `None`
        Integer of the start observation.

    Returns
    -------
    summary : Summary instance
        This holds the summary table and text, which can be printed or 
        converted to various output formats.

    """

    return self.sarimax_res.summary(alpha=alpha, start=start)

get_info_criteria(criteria='aic', method='standard')

Get the selected information criterion.

See https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAXResults.info_criteria.html for details on the statsmodels info_criteria method.

Parameters:

Name Type Description Default
criteria str

The information criteria to compute. Valid options are {'aic', 'bic', 'hqic'}.

`'aic'`
method str

The method for information criteria computation. Default is 'standard' method; 'lutkepohl' computes the information criteria as in Lütkepohl (2007).

`'standard'`

Returns:

Name Type Description
metric float

The value of the selected information criterion.

Source code in skforecast\Sarimax\Sarimax.py
@_check_fitted
def get_info_criteria(
    self,
    criteria: str = 'aic',
    method: str = 'standard'
) -> float:
    """
    Get the selected information criterion.

    See https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAXResults.info_criteria.html
    for details on the statsmodels info_criteria method.

    Parameters
    ----------
    criteria : str, default `'aic'`
        The information criteria to compute. Valid options are {'aic', 'bic',
        'hqic'}.
    method : str, default `'standard'`
        The method for information criteria computation. Default is 'standard'
        method; 'lutkepohl' computes the information criteria as in Lütkepohl
        (2007).

    Returns
    -------
    metric : float
        The value of the selected information criterion.

    """

    if criteria not in ['aic', 'bic', 'hqic']:
        raise ValueError(
            "Invalid value for `criteria`. Valid options are 'aic', 'bic', "
            "and 'hqic'."
        )

    if method not in ['standard', 'lutkepohl']:
        raise ValueError(
            "Invalid value for `method`. Valid options are 'standard' and "
            "'lutkepohl'."
        )

    metric = self.sarimax_res.info_criteria(criteria=criteria, method=method)

    return metric
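
For orientation, the sketch below computes the three criteria from their textbook definitions, on the assumption that the 'standard' method corresponds to them. The helper `info_criterion` and the numbers passed to it are hypothetical. In a fitted model the log-likelihood, the number of observations and the number of estimated parameters come from the results object, and the 'lutkepohl' variant is not covered here.

```python
import math


def info_criterion(
    llf: float, nobs: int, k_params: int, criteria: str = "aic"
) -> float:
    """Textbook information criteria from a log-likelihood (llf),
    the number of observations and the number of estimated parameters."""
    if criteria == "aic":
        return -2.0 * llf + 2.0 * k_params
    if criteria == "bic":
        return -2.0 * llf + math.log(nobs) * k_params
    if criteria == "hqic":
        return -2.0 * llf + 2.0 * math.log(math.log(nobs)) * k_params
    raise ValueError(
        "Invalid value for `criteria`. Valid options are 'aic', 'bic', "
        "and 'hqic'."
    )


# Hypothetical values, for illustration only.
llf, nobs, k_params = -120.5, 100, 3

aic = info_criterion(llf, nobs, k_params, "aic")
bic = info_criterion(llf, nobs, k_params, "bic")
hqic = info_criterion(llf, nobs, k_params, "hqic")

print(aic, bic, hqic)
```

Lower values indicate a better trade-off between fit and complexity. The three criteria differ only in how strongly each additional parameter is penalized.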