This class turns statistical models into a Forecaster compatible with the
skforecast API. It supports single or multiple statistical models for the
same time series, enabling model comparison and ensemble predictions.
A statistical model instance or a list of statistical model instances.
When a list is provided, all models are fitted to the same time series
and predictions from all models are returned. Supported models are:
skforecast.stats.Arima
skforecast.stats.Arar
skforecast.stats.Ets
skforecast.stats.Sarimax (statsmodels wrapper)
sktime.forecasting.ARIMA (pmdarima wrapper)
aeon.forecasting.stats.ARIMA
aeon.forecasting.stats.ETS
None
transformer_y
object transformer (preprocessor)
An instance of a transformer (preprocessor) compatible with the scikit-learn
preprocessing API with methods: fit, transform, fit_transform and inverse_transform.
ColumnTransformers are not allowed since they do not have inverse_transform method.
The transformation is applied to y before training the forecaster.
None
transformer_exog
object transformer (preprocessor)
An instance of a transformer (preprocessor) compatible with the scikit-learn
preprocessing API. The transformation is applied to exog before training the
forecaster. inverse_transform is not available when using ColumnTransformers.
Unique identifiers for each estimator, generated from estimator types and
numeric suffixes to handle duplicates (e.g., 'skforecast.Arima',
'skforecast.Arima_2', 'skforecast.Ets'). Used to identify predictions
from each model.
Descriptive names for each estimator including the fitted model configuration
(e.g., 'Arima(1,1,1)(0,0,0)[12]', 'Ets(AAA)', etc.). This is updated
after fitting to reflect the selected model.
An instance of a transformer (preprocessor) compatible with the scikit-learn
preprocessing API with methods: fit, transform, fit_transform and inverse_transform.
ColumnTransformers are not allowed since they do not have inverse_transform method.
The transformation is applied to y before training the forecaster.
An instance of a transformer (preprocessor) compatible with the scikit-learn
preprocessing API. The transformation is applied to exog before training the
forecaster. inverse_transform is not available when using ColumnTransformers.
Last window the forecaster has seen during training. It stores the
values needed to predict the next step immediately after the training data.
For statistical models, it stores all the training data.
Index the forecaster has seen during training and prediction. This
attribute's initial value is the index of the training data, but this
is extended after predictions are made using an external 'last_window'.
Type of each exogenous variable/s used in training before the transformation
applied by transformer_exog. If transformer_exog is not used, it
is equal to exog_dtypes_out_.
Type of each exogenous variable/s used in training after the transformation
applied by transformer_exog. If transformer_exog is not used, it
is equal to exog_dtypes_in_.
Names of the exogenous variables included in the matrix X_train created
internally for training. It can be different from exog_names_in_ if
some exogenous variables are transformed during the training process.
def__init__(self,estimator:object|list[object]=None,transformer_y:object|None=None,transformer_exog:object|None=None,forecaster_id:str|int|None=None,regressor:object=None,fit_kwargs:Any=None,)->None:# Valid estimator types (class-level constant)self.valid_estimator_types=('skforecast.stats._arima.Arima','skforecast.stats._arar.Arar','skforecast.stats._ets.Ets','skforecast.stats._sarimax.Sarimax','aeon.forecasting.stats._arima.ARIMA','aeon.forecasting.stats._ets.ETS','sktime.forecasting.arima._pmdarima.ARIMA')# TODO: Remove 0.21. Handle deprecated 'regressor' argumentestimator=initialize_estimator(estimator,regressor)ifnotisinstance(estimator,list):estimator=[estimator]else:iflen(estimator)==0:raiseValueError("`estimator` list cannot be empty.")# Validate all estimators and collect typesestimator_types=[]fori,estinenumerate(estimator):est_type=f"{type(est).__module__}.{type(est).__name__}"ifest_typenotinself.valid_estimator_types:raiseTypeError(f"Estimator at index {i} must be an instance of type "f"{self.valid_estimator_types}. 
Got '{type(est)}'.")estimator_types.append(est_type)# TODO: Evaluate if include 'aggregate' parameter for multiple estimators, it# aggregates predictions from all estimators.self.estimators=estimatorself.estimators_=[copy(est)forestinself.estimators]self.estimator_ids=self._generate_ids(self.estimators)self.estimator_types=estimator_typesself.estimator_names_=[None]*len(self.estimators)self.n_estimators=len(self.estimators)self.transformer_y=transformer_yself.transformer_exog=transformer_exogself.last_window_=Noneself.extended_index_=Noneself.index_type_=Noneself.index_freq_=Noneself.training_range_=Noneself.series_name_in_=Noneself.exog_in_=Falseself.exog_names_in_=Noneself.exog_type_in_=Noneself.exog_dtypes_in_=Noneself.exog_dtypes_out_=Noneself.X_train_exog_names_out_=Noneself.creation_date=pd.Timestamp.today().strftime('%Y-%m-%d %H:%M:%S')self.is_fitted=Falseself.fit_date=Noneself.skforecast_version=__version__self.python_version=sys.version.split(" ")[0]self.forecaster_id=forecaster_idself.window_size=1# Ignored, present for API consistencyself.fit_kwargs=None# Ignored, present for API 
consistencyself.estimator_params_={est_id:est.get_params()forest_id,estinzip(self.estimator_ids,self.estimators_)}self.estimators_support_last_window=('skforecast.stats._sarimax.Sarimax',)self.estimators_support_exog=('skforecast.stats._arima.Arima','skforecast.stats._arar.Arar','skforecast.stats._sarimax.Sarimax','sktime.forecasting.arima._pmdarima.ARIMA',)self.estimators_support_interval=('skforecast.stats._arima.Arima','skforecast.stats._arar.Arar','skforecast.stats._ets.Ets','skforecast.stats._sarimax.Sarimax','sktime.forecasting.arima._pmdarima.ARIMA')self.estimators_support_reduce_memory=('skforecast.stats._arima.Arima','skforecast.stats._arar.Arar','skforecast.stats._ets.Ets')self._predict_dispatch={'skforecast.stats._arima.Arima':self._predict_skforecast_stats,'skforecast.stats._arar.Arar':self._predict_skforecast_stats,'skforecast.stats._ets.Ets':self._predict_skforecast_stats,'skforecast.stats._sarimax.Sarimax':self._predict_sarimax,'aeon.forecasting.stats._arima.ARIMA':self._predict_aeon,'aeon.forecasting.stats._ets.ETS':self._predict_aeon,'sktime.forecasting.arima._pmdarima.ARIMA':self._predict_sktime_arima}self._predict_interval_dispatch={'skforecast.stats._arima.Arima':self._predict_interval_skforecast_stats,'skforecast.stats._arar.Arar':self._predict_interval_skforecast_stats,'skforecast.stats._ets.Ets':self._predict_interval_skforecast_stats,'skforecast.stats._sarimax.Sarimax':self._predict_interval_sarimax,'sktime.forecasting.arima._pmdarima.ARIMA':self._predict_interval_sktime_arima,}self._feature_importances_dispatch={'skforecast.stats._arima.Arima':self._get_feature_importances_skforecast_stats,'skforecast.stats._arar.Arar':self._get_feature_importances_skforecast_stats,'skforecast.stats._ets.Ets':self._get_feature_importances_skforecast_stats,'skforecast.stats._sarimax.Sarimax':self._get_feature_importances_skforecast_stats,'aeon.forecasting.stats._arima.ARIMA':self._get_feature_importances_aeon_arima,'aeon.forecasting.stats._ets.ETS':self._get_
feature_importances_aeon_ets,'sktime.forecasting.arima._pmdarima.ARIMA':self._get_feature_importances_sktime_arima}self._info_criteria_dispatch={'skforecast.stats._arima.Arima':self._get_info_criteria_skforecast_stats,'skforecast.stats._arar.Arar':self._get_info_criteria_skforecast_stats,'skforecast.stats._ets.Ets':self._get_info_criteria_skforecast_stats,'skforecast.stats._sarimax.Sarimax':self._get_info_criteria_sarimax,'aeon.forecasting.stats._arima.ARIMA':self._get_info_criteria_aeon,'aeon.forecasting.stats._ets.ETS':self._get_info_criteria_aeon,'sktime.forecasting.arima._pmdarima.ARIMA':self._get_info_criteria_sktime_arima}self.__skforecast_tags__={"library":"skforecast","forecaster_name":"ForecasterStats","forecaster_task":"regression","forecasting_scope":"single-series",# single-series | global"forecasting_strategy":"recursive",# recursive | direct | deep_learning"multiple_estimators":True,"index_types_supported":["pandas.RangeIndex","pandas.DatetimeIndex"],"requires_index_frequency":True,"allowed_input_types_series":["pandas.Series"],"supports_exog":True,"allowed_input_types_exog":["pandas.Series","pandas.DataFrame"],"handles_missing_values_series":False,"handles_missing_values_exog":False,"supports_lags":False,"supports_window_features":False,"supports_transformer_series":True,"supports_transformer_exog":True,"supports_weight_func":False,"supports_differentiation":False,"prediction_types":["point","interval"],"supports_probabilistic":True,"probabilistic_methods":["distribution"],"handles_binned_residuals":False}
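The constructor above validates each estimator by its fully qualified type name and later uses the same string as the key of several dispatch dictionaries. The following is a minimal sketch of that pattern only, not the library's implementation: the classes are hypothetical stand-ins whose `__module__` is set by hand, and `PREDICT_DISPATCH` and `validate_and_dispatch` are illustrative names that do not exist in skforecast.

```python
# Hypothetical stand-ins for real model classes. `__module__` is overridden so
# the fully qualified name matches the strings used in the constructor.
class Arima:
    __module__ = "skforecast.stats._arima"


class Sarimax:
    __module__ = "skforecast.stats._sarimax"


class Unknown:
    __module__ = "some.other.package"


def qualified_name(est: object) -> str:
    """Build the '<module>.<class>' string used as validation and dispatch key."""
    return f"{type(est).__module__}.{type(est).__name__}"


# Illustrative dispatch table: type name -> label of the backend that would run.
PREDICT_DISPATCH = {
    "skforecast.stats._arima.Arima": lambda est: "skforecast_stats",
    "skforecast.stats._sarimax.Sarimax": lambda est: "sarimax",
}


def validate_and_dispatch(estimators: list) -> list[str]:
    """Reject unsupported types, then route each estimator through the table."""
    results = []
    for i, est in enumerate(estimators):
        est_type = qualified_name(est)
        if est_type not in PREDICT_DISPATCH:
            raise TypeError(
                f"Estimator at index {i} has unsupported type '{est_type}'."
            )
        results.append(PREDICT_DISPATCH[est_type](est))
    return results


backends = validate_and_dispatch([Arima(), Sarimax()])
print(backends)
```

Keying on a string instead of the class object means the optional libraries (aeon, sktime) never need to be imported just to check a type.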
def _generate_ids(self, estimators: list) -> list[str]:
    """
    Generate unique ids for a list of estimators. Handles duplicate ids by
    appending a numeric suffix.

    Parameters
    ----------
    estimators : list
        List of statistical model instances.

    Returns
    -------
    ids : list[str]
        List of unique ids for each estimator.

    """

    ids = []
    id_counts = {}
    for est in estimators:
        base_id = (
            f"{type(est).__module__.split('.')[0]}.{type(est).__name__}"
        )
        # Track occurrences and add suffix for duplicates
        if base_id in id_counts:
            id_counts[base_id] += 1
            unique_id = f"{base_id}_{id_counts[base_id]}"
        else:
            id_counts[base_id] = 1
            unique_id = base_id
        ids.append(unique_id)

    return ids
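To see the suffix rule in action, here is a standalone sketch of the same logic as a free function. The `Arima` and `Ets` classes are hypothetical placeholders (their `__module__` is set by hand so the top-level package reads `skforecast`), and `generate_ids` is an illustrative name, not part of the library.

```python
# Hypothetical placeholder classes; only the module and class names matter here.
class Arima:
    __module__ = "skforecast.stats._arima"


class Ets:
    __module__ = "skforecast.stats._ets"


def generate_ids(estimators: list) -> list[str]:
    """Return '<top-level package>.<class>' ids, suffixing repeats with _2, _3, ..."""
    ids = []
    id_counts = {}
    for est in estimators:
        base_id = f"{type(est).__module__.split('.')[0]}.{type(est).__name__}"
        if base_id in id_counts:
            # Second and later occurrences get a numeric suffix
            id_counts[base_id] += 1
            ids.append(f"{base_id}_{id_counts[base_id]}")
        else:
            id_counts[base_id] = 1
            ids.append(base_id)
    return ids


ids = generate_ids([Arima(), Arima(), Ets()])
print(ids)  # ['skforecast.Arima', 'skforecast.Arima_2', 'skforecast.Ets']
```

Note that the first occurrence keeps the bare id and numbering starts at `_2` for the second.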
def get_estimator(self, id: str) -> object:
    """
    Get a specific estimator by its id.

    Parameters
    ----------
    id : str
        The id of the estimator to retrieve.

    Returns
    -------
    estimator : object
        The requested estimator instance.

    """

    if id not in self.estimator_ids:
        raise KeyError(
            f"No estimator with id '{id}'. "
            f"Available estimators: {self.estimator_ids}"
        )

    idx = self.estimator_ids.index(id)

    return self.estimators_[idx]
Source code in skforecast\recursive\_forecaster_stats.py
def get_estimator_ids(self) -> list[str]:
    """
    Get the ids of all estimators in the forecaster.

    Returns
    -------
    estimator_ids : list[str]
        List of estimator ids.

    """

    return self.estimator_ids
def remove_estimators(self, ids: str | list[str]) -> None:
    """
    Remove one or more estimators by their ids.

    Parameters
    ----------
    ids : str, list[str]
        The ids of the estimators to remove.

    Returns
    -------
    None

    """

    if isinstance(ids, str):
        ids = [ids]

    missing_ids = [id for id in ids if id not in self.estimator_ids]
    if missing_ids:
        raise KeyError(
            f"No estimator(s) with id '{missing_ids}'. "
            f"Available estimators: {self.estimator_ids}"
        )

    for id in ids:
        idx = self.estimator_ids.index(id)
        del self.estimators[idx]
        del self.estimators_[idx]
        del self.estimator_ids[idx]
        del self.estimator_names_[idx]
        del self.estimator_types[idx]
        self.n_estimators -= 1
def _preprocess_repr(self) -> tuple[list[str], str]:
    """
    Format text for __repr__ method.

    Returns
    -------
    estimator_params : list[str]
        List of formatted parameters for each estimator.
    exog_names_in_ : str
        Formatted exogenous variable names.

    """

    # Format parameters for each estimator
    estimator_params = []
    for id in self.estimator_ids:
        params = str(self.estimator_params_[id])
        if len(params) > 58:
            params = "\n " + textwrap.fill(
                params, width=76, subsequent_indent=" "
            )
        estimator_params.append(f"{id}: {params}")

    # Format exogenous variable names
    exog_names_in_ = None
    if self.exog_names_in_ is not None:
        exog_names_in_ = copy(self.exog_names_in_)
        if len(exog_names_in_) > 50:
            exog_names_in_ = exog_names_in_[:50] + ["..."]
        exog_names_in_ = ", ".join(exog_names_in_)
        if len(exog_names_in_) > 58:
            exog_names_in_ = "\n " + textwrap.fill(
                exog_names_in_, width=80, subsequent_indent=" "
            )

    return estimator_params, exog_names_in_
Fits all estimators to the same time series. Each estimator is trained
independently on the transformed data.
Parameters:
Name
Type
Description
Default
y
pandas Series
Training time series.
required
exog
pandas Series, pandas DataFrame
Exogenous variable/s included as predictor/s. Must have the same
number of observations as y and their indexes must be aligned so
that y[i] is regressed on exog[i].
deffit(self,y:pd.Series,exog:pd.Series|pd.DataFrame|None=None,store_last_window:bool=True,suppress_warnings:bool=False)->None:""" Training Forecaster. Fits all estimators to the same time series. Each estimator is trained independently on the transformed data. Parameters ---------- y : pandas Series Training time series. exog : pandas Series, pandas DataFrame, default None Exogenous variable/s included as predictor/s. Must have the same number of observations as `y` and their indexes must be aligned so that y[i] is regressed on exog[i]. store_last_window : bool, default True Whether or not to store the last window (`last_window_`) of training data. suppress_warnings : bool, default False If `True`, warnings generated during fitting will be ignored. Returns ------- None """set_skforecast_warnings(suppress_warnings,action='ignore')self.estimators_=[copy(est)forestinself.estimators]self.estimator_names_=[None]*len(self.estimators)self.estimator_params_=Noneself.last_window_=Noneself.extended_index_=Noneself.index_type_=Noneself.index_freq_=Noneself.training_range_=Noneself.series_name_in_=Noneself.exog_in_=Falseself.exog_names_in_=Noneself.exog_type_in_=Noneself.exog_dtypes_in_=Noneself.exog_dtypes_out_=Noneself.X_train_exog_names_out_=Noneself.in_sample_residuals_=Noneself.is_fitted=Falseself.fit_date=Nonecheck_y(y=y)ifexogisnotNone:# NaNs are checked latercheck_exog(exog=exog)iflen(exog)!=len(y):raiseValueError(f"`exog` must have same number of samples as `y`. 
"f"length `exog`: ({len(exog)}), length `y`: ({len(y)})")unsupported_exog=[idforid,est_typeinzip(self.estimator_ids,self.estimator_types)ifest_typenotinself.estimators_support_exog]ifunsupported_exog:warnings.warn(f"The following estimators do not support exogenous variables and "f"will ignore them during fit: {unsupported_exog}",IgnoredArgumentWarning)y=transform_series(series=y,transformer=self.transformer_y,fit=True,inverse_transform=False)ifexogisnotNone:# NOTE: This must be here, before transforming exogself.exog_in_=Trueself.exog_type_in_=type(exog)self.exog_names_in_=(exog.columns.to_list()ifisinstance(exog,pd.DataFrame)else[exog.name])self.exog_dtypes_in_=get_exog_dtypes(exog=exog)ifisinstance(exog,pd.Series):exog=exog.to_frame()exog=transform_dataframe(df=exog,transformer=self.transformer_exog,fit=True,inverse_transform=False)check_exog_dtypes(exog,call_check_exog=True)self.exog_dtypes_out_=get_exog_dtypes(exog=exog)self.X_train_exog_names_out_=exog.columns.to_list()ifsuppress_warnings:withwarnings.catch_warnings():warnings.simplefilter("ignore")forestimatorinself.estimators_:estimator.fit(y=y,exog=exog)else:forestimatorinself.estimators_:estimator.fit(y=y,exog=exog)self.is_fitted=Truefori,estimatorinenumerate(self.estimators_):# Check if estimator has estimator_name_ attribute (skforecast models)ifhasattr(estimator,'estimator_name_')andestimator.estimator_name_isnotNone:self.estimator_names_[i]=estimator.estimator_name_else:self.estimator_names_[i]=f"{type(estimator).__module__.split('.')[0]}.{type(estimator).__name__}"self.estimator_params_={est_id:est.get_params()forest_id,estinzip(self.estimator_ids,self.estimators_)}self.series_name_in_=y.nameify.nameisnotNoneelse'y'self.fit_date=pd.Timestamp.today().strftime('%Y-%m-%d %H:%M:%S')self.training_range_=y.index[[0,-1]]self.index_type_=type(y.index)ifisinstance(y.index,pd.DatetimeIndex):self.index_freq_=y.index.freqstrelse:self.index_freq_=y.index.step# TODO: Check when multiple series are 
supportedifstore_last_window:self.last_window_=y.copy()# Set extended_index_ based on first SARIMAX estimator or default to y.indexfirst_sarimax=next((estforest,est_typeinzip(self.estimators_,self.estimator_types)ifest_type=='skforecast.stats._sarimax.Sarimax'),None)iffirst_sarimaxisnotNone:self.extended_index_=first_sarimax.sarimax_res.fittedvalues.index.copy()else:self.extended_index_=y.indexset_skforecast_warnings(suppress_warnings,action='default')
Create and validate inputs needed for the prediction process.
This method prepares the inputs required by the predict methods,
including validation of last_window and exog, and applying
transformations if configured.
Series values used to create the predictors needed in the
predictions. If last_window = None, the values stored in
self.last_window_ are used.
When provided, last_window must start right after the end of the
index seen by the forecaster during training. This is only supported
for skforecast.Sarimax estimator.
None
last_window_exog
pandas Series, pandas DataFrame
Values of the exogenous variables aligned with last_window. Only
needed when last_window is not None and the forecaster has been
trained including exogenous variables. Must start at the end
of the training data.
None
exog
pandas Series, pandas DataFrame
Exogenous variable/s included as predictor/s.
None
Returns:
Name
Type
Description
last_window
pandas Series
Transformed series values for prediction.
last_window_exog
pandas DataFrame, None
Transformed exogenous variables aligned with last_window.
exog
pandas DataFrame, None
Transformed exogenous variable/s for prediction.
prediction_index
pandas Index
Index for the predicted values, starting right after the end of
the training data.
Source code in skforecast\recursive\_forecaster_stats.py
def_create_predict_inputs(self,steps:int,last_window:pd.Series|None=None,last_window_exog:pd.Series|pd.DataFrame|None=None,exog:pd.Series|pd.DataFrame|None=None)->tuple[pd.Series,pd.DataFrame|None,pd.DataFrame|None,pd.Index]:""" Create and validate inputs needed for the prediction process. This method prepares the inputs required by the predict methods, including validation of `last_window` and `exog`, and applying transformations if configured. Parameters ---------- steps : int Number of steps to predict. last_window : pandas Series, default None Series values used to create the predictors needed in the predictions. If `last_window = None`, the values stored in `self.last_window_` are used. When provided, `last_window` must start right after the end of the index seen by the forecaster during training. This is only supported for skforecast.Sarimax estimator. last_window_exog : pandas Series, pandas DataFrame, default None Values of the exogenous variables aligned with `last_window`. Only needed when `last_window` is not None and the forecaster has been trained including exogenous variables. Must start at the end of the training data. exog : pandas Series, pandas DataFrame, default None Exogenous variable/s included as predictor/s. Returns ------- last_window : pandas Series Transformed series values for prediction. last_window_exog : pandas DataFrame, None Transformed exogenous variables aligned with `last_window`. exog : pandas DataFrame, None Transformed exogenous variable/s for prediction. prediction_index : pandas Index Index for the predicted values, starting right after the end of the training data. """# Needs to be a new variable to avoid arima_res_.append when using # self.last_window. 
It already has it stored.last_window_check=last_windowiflast_windowisnotNoneelseself.last_window_check_predict_input(forecaster_name=type(self).__name__,steps=steps,is_fitted=self.is_fitted,exog_in_=self.exog_in_,index_type_=self.index_type_,index_freq_=self.index_freq_,window_size=self.window_size,last_window=last_window_check,last_window_exog=last_window_exog,exog=exog,exog_names_in_=self.exog_names_in_,interval=None,alpha=None)iflast_windowisNoneandlast_window_exogisnotNone:raiseValueError("To make predictions unrelated to the original data, both ""`last_window` and `last_window_exog` must be provided.")# Check if forecaster needs exogiflast_windowisnotNoneandlast_window_exogisNoneandself.exog_in_:raiseValueError("Forecaster trained with exogenous variable/s. To make predictions ""unrelated to the original data, same variable/s must be provided ""using `last_window_exog`.")iflast_windowisnotNone:# If predictions do not follow directly from the end of the training # data. The internal statsmodels SARIMAX model needs to be updated # using its append method. 
The data needs to start at the end of the # training series.expected_index=expand_index(index=self.extended_index_,steps=1)[0]ifexpected_index!=last_window.index[0]:raiseValueError(f"To make predictions unrelated to the original data, `last_window` "f"has to start at the end of the index seen by the forecaster.\n"f" Series last index : {self.extended_index_[-1]}.\n"f" Expected index : {expected_index}.\n"f" `last_window` index start : {last_window.index[0]}.")last_window=last_window.copy()last_window=transform_series(series=last_window,transformer=self.transformer_y,fit=False,inverse_transform=False)iflast_window_exogisnotNone:ifexpected_index!=last_window_exog.index[0]:raiseValueError(f"To make predictions unrelated to the original data, `last_window_exog` "f"has to start at the end of the index seen by the forecaster.\n"f" Series last index : {self.extended_index_[-1]}.\n"f" Expected index : {expected_index}.\n"f" `last_window_exog` index start : {last_window_exog.index[0]}.")ifisinstance(last_window_exog,pd.Series):last_window_exog=last_window_exog.to_frame()last_window_exog=transform_dataframe(df=last_window_exog,transformer=self.transformer_exog,fit=False,inverse_transform=False)ifexogisnotNone:ifisinstance(exog,pd.Series):exog=exog.to_frame()exog=transform_dataframe(df=exog,transformer=self.transformer_exog,fit=False,inverse_transform=False)exog=exog.iloc[:steps,]# Prediction index starting right after the end of the training dataprediction_index=expand_index(index=self.last_window_.index,steps=steps)returnlast_window,last_window_exog,exog,prediction_index
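The key validation in `_create_predict_inputs` is that a user-supplied `last_window` begins exactly one step after the last index the forecaster has seen. The sketch below illustrates that check with plain `datetime` objects under the assumption of a fixed-step frequency; `next_index` is a hypothetical stand-in for the library's `expand_index` helper and `check_last_window_start` is an illustrative name.

```python
from datetime import date, timedelta


def next_index(last_seen: date, freq: timedelta) -> date:
    """Hypothetical stand-in for expand_index(index, steps=1)[0]."""
    return last_seen + freq


def check_last_window_start(
    last_seen: date, freq: timedelta, last_window_index: list[date]
) -> date:
    """Raise if the new window does not start right after the seen index."""
    expected = next_index(last_seen, freq)
    if last_window_index[0] != expected:
        raise ValueError(
            f"`last_window` has to start at {expected}, "
            f"got {last_window_index[0]} (series last index: {last_seen})."
        )
    return expected


# Training data ended on 2024-01-31 with daily frequency.
start = check_last_window_start(
    last_seen=date(2024, 1, 31),
    freq=timedelta(days=1),
    last_window_index=[date(2024, 2, 1), date(2024, 2, 2)],
)
print(start)
```

A window starting on 2024-02-03 would leave a gap in the series and is rejected with a `ValueError`.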
Handle the last_window logic for prediction methods.
This method validates that SARIMAX estimators exist, warns about
unsupported estimators, appends the last_window data to SARIMAX
estimators, and returns the updated prediction index.
def _check_append_last_window(
    self,
    steps: int,
    last_window: pd.Series,
    last_window_exog: pd.DataFrame | None
) -> pd.Index:
    """
    Handle the last_window logic for prediction methods.

    This method validates that SARIMAX estimators exist, warns about
    unsupported estimators, appends the last_window data to SARIMAX
    estimators, and returns the updated prediction index.

    Parameters
    ----------
    steps : int
        Number of steps to predict.
    last_window : pandas Series
        Transformed series values for prediction.
    last_window_exog : pandas DataFrame, None
        Transformed exogenous variables aligned with `last_window`.

    Returns
    -------
    prediction_index : pandas Index
        Updated index for the predicted values.

    """

    sarimax_indices = [
        i for i, estimator_type in enumerate(self.estimator_types)
        if estimator_type == 'skforecast.stats._sarimax.Sarimax'
    ]

    if not sarimax_indices:
        raise NotImplementedError(
            "Prediction with `last_window` parameter is only supported for "
            "skforecast.Sarimax estimator. The forecaster does not contain any "
            "estimator that supports this feature."
        )

    unsupported_last_window = [
        id for id, estimator_type in zip(self.estimator_ids, self.estimator_types)
        if estimator_type not in self.estimators_support_last_window
    ]
    if unsupported_last_window:
        warnings.warn(
            f"Prediction with `last_window` is not implemented for estimators: {unsupported_last_window}. "
            f"These estimators will be skipped. Available estimators for prediction "
            f"using `last_window` are: {list(self.estimators_support_last_window)}.",
            IgnoredArgumentWarning
        )

    for i in sarimax_indices:
        self.estimators_[i].append(
            y=last_window,
            exog=last_window_exog,
            refit=False
        )

    self.extended_index_ = self.estimators_[sarimax_indices[0]].sarimax_res.fittedvalues.index
    prediction_index = expand_index(index=self.extended_index_, steps=steps)

    return prediction_index
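The first half of this method is a partition of the estimators into those that can consume `last_window` and those that will be skipped. A minimal, library-free sketch of that partition follows; `SUPPORT_LAST_WINDOW` mirrors the `estimators_support_last_window` tuple shown in the constructor, while `split_by_last_window_support` is an illustrative name and the warning step is reduced to returning the skipped ids.

```python
# Mirrors the tuple assigned to `estimators_support_last_window` in __init__.
SUPPORT_LAST_WINDOW = ("skforecast.stats._sarimax.Sarimax",)


def split_by_last_window_support(
    estimator_ids: list[str], estimator_types: list[str]
) -> tuple[list[str], list[str]]:
    """Return (ids that accept last_window, ids that would be skipped)."""
    supported, skipped = [], []
    for est_id, est_type in zip(estimator_ids, estimator_types):
        if est_type in SUPPORT_LAST_WINDOW:
            supported.append(est_id)
        else:
            skipped.append(est_id)
    if not supported:
        # Same outcome as the method above: nothing can use last_window
        raise NotImplementedError(
            "No estimator in the forecaster supports `last_window`."
        )
    return supported, skipped


supported, skipped = split_by_last_window_support(
    estimator_ids=["skforecast.Sarimax", "skforecast.Arima", "skforecast.Ets"],
    estimator_types=[
        "skforecast.stats._sarimax.Sarimax",
        "skforecast.stats._arima.Arima",
        "skforecast.stats._ets.Ets",
    ],
)
print(supported, skipped)
```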
Generate predictions (forecasts) n steps in the future using all
fitted estimators. If exogenous variables were used during training,
they must be provided for prediction.
When using last_window and last_window_exog, they must start right
after the end of the index seen by the forecaster during training.
This feature is only supported for skforecast.Sarimax estimator;
other estimators will ignore last_window and predict from the end
of the training data.
Series values used to create the predictors needed in the
predictions. Used to make predictions unrelated to the original data.
Values must start at the end of the training data. Only supported
for skforecast.Sarimax estimator.
None
last_window_exog
pandas Series, pandas DataFrame
Values of the exogenous variables aligned with last_window. Only
needed when last_window is not None and the forecaster has been
trained including exogenous variables. Values must start at the end
of the training data.
If True, skforecast warnings will be suppressed during the prediction
process. See skforecast.exceptions.warn_skforecast_categories for more
information.
False
Returns:
Name
Type
Description
predictions
pandas Series, pandas DataFrame
Predicted values from all estimators:
For multiple estimators: long format DataFrame with columns
'estimator_id' (estimator id) and 'pred' (predicted value).
For a single estimator: pandas Series with predicted values.
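The long format described above repeats each prediction index once per estimator and cycles through the estimator ids within each step. The pure-Python sketch below reproduces only that row ordering; the ids and values are made up for illustration and `to_long_format` is a hypothetical helper, not a skforecast function.

```python
def to_long_format(
    prediction_index: list, predictions_by_estimator: dict[str, list[float]]
) -> list[tuple]:
    """Interleave per-estimator predictions as (index, estimator_id, pred) rows."""
    estimator_ids = list(predictions_by_estimator)
    rows = []
    for step, idx in enumerate(prediction_index):
        # One row per estimator for every forecast step
        for est_id in estimator_ids:
            rows.append((idx, est_id, predictions_by_estimator[est_id][step]))
    return rows


rows = to_long_format(
    prediction_index=[100, 101],
    predictions_by_estimator={
        "skforecast.Arima": [1.0, 2.0],
        "skforecast.Ets": [1.5, 2.5],
    },
)
for row in rows:
    print(row)
```

With two steps and two estimators this yields four rows, ordered by step first and by estimator second, so filtering on `estimator_id` recovers each model's forecast as a regular series.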
Source code in skforecast\recursive\_forecaster_stats.py
def predict(
    self,
    steps: int,
    last_window: pd.Series | None = None,
    last_window_exog: pd.Series | pd.DataFrame | None = None,
    exog: pd.Series | pd.DataFrame | None = None,
    suppress_warnings: bool = False
) -> pd.Series | pd.DataFrame:
    """
    Forecast future values.

    Generate predictions (forecasts) n steps in the future using all
    fitted estimators. If exogenous variables were used during training,
    they must be provided for prediction.

    When using `last_window` and `last_window_exog`, they must start right
    after the end of the index seen by the forecaster during training.
    This feature is only supported for skforecast.Sarimax estimator;
    other estimators will ignore `last_window` and predict from the end
    of the training data.

    Parameters
    ----------
    steps : int
        Number of steps to predict.
    last_window : pandas Series, default None
        Series values used to create the predictors needed in the
        predictions. Used to make predictions unrelated to the original data.
        Values must start at the end of the training data. Only supported
        for skforecast.Sarimax estimator.
    last_window_exog : pandas Series, pandas DataFrame, default None
        Values of the exogenous variables aligned with `last_window`. Only
        needed when `last_window` is not None and the forecaster has been
        trained including exogenous variables. Values must start at the end
        of the training data.
    exog : pandas Series, pandas DataFrame, default None
        Exogenous variable/s included as predictor/s.
    suppress_warnings : bool, default False
        If `True`, skforecast warnings will be suppressed during the prediction
        process. See skforecast.exceptions.warn_skforecast_categories for more
        information.

    Returns
    -------
    predictions : pandas Series, pandas DataFrame
        Predicted values from all estimators:

        - For multiple estimators: long format DataFrame with columns
        'estimator_id' (estimator id) and 'pred' (predicted value).
        - For a single estimator: pandas Series with predicted values.

    """

    set_skforecast_warnings(suppress_warnings, action='ignore')

    last_window, last_window_exog, exog, prediction_index = (
        self._create_predict_inputs(
            steps=steps,
            last_window=last_window,
            last_window_exog=last_window_exog,
            exog=exog,
        )
    )

    if last_window is not None:
        prediction_index = self._check_append_last_window(
            steps=steps,
            last_window=last_window,
            last_window_exog=last_window_exog
        )

    all_predictions = []
    estimator_ids = []
    for estimator, est_id, est_type in zip(
        self.estimators_, self.estimator_ids, self.estimator_types
    ):
        if last_window is not None and est_type not in self.estimators_support_last_window:
            continue

        pred_func = self._predict_dispatch[est_type]
        preds = pred_func(estimator=estimator, steps=steps, exog=exog)
        all_predictions.append(preds)
        estimator_ids.append(est_id)

    n_estimators = len(estimator_ids)
    if n_estimators == 1:
        all_predictions = all_predictions[0]
    else:
        all_predictions = np.column_stack(all_predictions).ravel()

    predictions = transform_numpy(
        array=all_predictions,
        transformer=self.transformer_y,
        fit=False,
        inverse_transform=True
    )

    if self.n_estimators == 1:
        predictions = pd.Series(
            data=predictions.ravel(),
            index=prediction_index,
            name='pred'
        )
    else:
        predictions = pd.DataFrame(
            {"estimator_id": np.tile(estimator_ids, steps), "pred": predictions.ravel()},
            index=np.repeat(prediction_index, n_estimators),
        )

    set_skforecast_warnings(suppress_warnings, action='default')

    return predictions
Source code in skforecast\recursive\_forecaster_stats.py
def _predict_aeon(
    self,
    estimator: object,
    steps: int,
    exog: pd.Series | pd.DataFrame | None
) -> np.ndarray:  # pragma: no cover
    """Generate predictions using AEON models."""
    preds = estimator.iterative_forecast(
        y=self.last_window_.to_numpy(),
        prediction_horizon=steps
    )
    return preds
Source code in skforecast\recursive\_forecaster_stats.py
def _predict_sktime_arima(
    self,
    estimator: object,
    steps: int,
    exog: pd.Series | pd.DataFrame | None
) -> np.ndarray:  # pragma: no cover
    """Generate predictions using sktime ARIMA model."""
    fh = np.arange(1, steps + 1)
    preds = estimator.predict(fh=fh, X=exog).to_numpy()
    return preds
Forecast future values and their confidence intervals.
Generate predictions (forecasts) n steps in the future with confidence
intervals using fitted estimators that support prediction intervals.
If exogenous variables were used during training, they must be provided
for prediction.
Estimators that do not support prediction intervals will be skipped
with a warning. Supported estimators for intervals are the ones listed
in the attribute estimators_support_interval.
When using last_window and last_window_exog, they must start right
after the end of the index seen by the forecaster during training.
This feature is only supported for skforecast.Sarimax estimator;
other estimators will ignore last_window and predict from the end
of the training data.
Series values used to create the predictors needed in the
predictions. Used to make predictions unrelated to the original data.
Values must start at the end of the training data. Only supported
for skforecast.Sarimax estimator.
None
last_window_exog
pandas Series, pandas DataFrame
Values of the exogenous variables aligned with last_window. Only
needed when last_window is not None and the forecaster has been
trained including exogenous variables.
Confidence of the estimated prediction interval, given as a sequence of
two percentiles between 0 and 100 inclusive. The values must be
symmetric. For example, a 95% interval should be given as
interval = [2.5, 97.5]. If both alpha and interval are
provided, alpha will be used.
If True, skforecast warnings will be suppressed during the prediction
process. See skforecast.exceptions.warn_skforecast_categories for more
information.
False
Returns:
Name
Type
Description
predictions
pandas DataFrame
Predicted values from estimators that support intervals and their
estimated intervals:
For multiple estimators: long format DataFrame with columns
'estimator_id', 'pred', 'lower_bound', 'upper_bound'.
For a single estimator: DataFrame with columns
'pred', 'lower_bound', 'upper_bound'.
Source code in skforecast\recursive\_forecaster_stats.py
def predict_interval(
    self,
    steps: int,
    last_window: pd.Series | None = None,
    last_window_exog: pd.Series | pd.DataFrame | None = None,
    exog: pd.Series | pd.DataFrame | None = None,
    alpha: float = 0.05,
    interval: list[float] | tuple[float] | None = None,
    suppress_warnings: bool = False
) -> pd.DataFrame:
    """
    Forecast future values and their confidence intervals.

    Generate predictions (forecasts) n steps in the future with confidence
    intervals using fitted estimators that support prediction intervals.
    If exogenous variables were used during training, they must be provided
    for prediction.

    Estimators that do not support prediction intervals will be skipped
    with a warning. Supported estimators for intervals are the ones listed
    in the attribute `estimators_support_intervals`.

    When using `last_window` and `last_window_exog`, they must start right
    after the end of the index seen by the forecaster during training.
    This feature is only supported for skforecast.Sarimax estimator; other
    estimators will ignore `last_window` and predict from the end of the
    training data.

    Parameters
    ----------
    steps : int
        Number of steps to predict.
    last_window : pandas Series, default None
        Series values used to create the predictors needed in the
        predictions. Used to make predictions unrelated to the original data.
        Values must start at the end of the training data.
        Only supported for skforecast.Sarimax estimator.
    last_window_exog : pandas Series, pandas DataFrame, default None
        Values of the exogenous variables aligned with `last_window`. Only
        needed when `last_window` is not None and the forecaster has been
        trained including exogenous variables.
    exog : pandas Series, pandas DataFrame, default None
        Exogenous variable/s included as predictor/s.
    alpha : float, default 0.05
        The confidence intervals for the forecasts are (1 - alpha) %.
        If both, `alpha` and `interval` are provided, `alpha` will be used.
    interval : list, tuple, default None
        Confidence of the prediction interval estimated. The values must be
        symmetric. Sequence of percentiles to compute, which must be between
        0 and 100 inclusive. For example, interval of 95% should be as
        `interval = [2.5, 97.5]`. If both, `alpha` and `interval` are
        provided, `alpha` will be used.
    suppress_warnings : bool, default False
        If `True`, skforecast warnings will be suppressed during the
        prediction process. See skforecast.exceptions.warn_skforecast_categories
        for more information.

    Returns
    -------
    predictions : pandas DataFrame
        Predicted values from estimators that support intervals and their
        estimated intervals:

        - For multiple estimators: long format DataFrame with columns
          'estimator_id', 'pred', 'lower_bound', 'upper_bound'.
        - For a single estimator: DataFrame with columns 'pred',
          'lower_bound', 'upper_bound'.

    """

    set_skforecast_warnings(suppress_warnings, action='ignore')

    # If interval and alpha take alpha, if interval transform to alpha
    if alpha is None:
        if 100 - interval[1] != interval[0]:
            raise ValueError(
                f"When using `interval` in ForecasterStats, it must be symmetrical. "
                f"For example, interval of 95% should be as `interval = [2.5, 97.5]`. "
                f"Got {interval}."
            )
        alpha = 2 * (100 - interval[1]) / 100

    last_window, last_window_exog, exog, prediction_index = (
        self._create_predict_inputs(
            steps=steps,
            last_window=last_window,
            last_window_exog=last_window_exog,
            exog=exog,
        )
    )

    if last_window is not None:
        prediction_index = self._check_append_last_window(
            steps=steps,
            last_window=last_window,
            last_window_exog=last_window_exog
        )

    unsupported_interval = [
        id for id, est_type in zip(self.estimator_ids, self.estimator_types)
        if est_type not in self.estimators_support_interval
    ]
    if unsupported_interval:
        warnings.warn(
            f"Interval prediction is not implemented for estimators: {unsupported_interval}. "
            f"These estimators will be skipped. Available estimators for prediction "
            f"intervals are: {list(self.estimators_support_interval)}.",
            IgnoredArgumentWarning
        )

    all_predictions = []
    estimator_ids = []
    for estimator, est_id, est_type in zip(
        self.estimators_, self.estimator_ids, self.estimator_types
    ):
        if est_type not in self.estimators_support_interval:
            continue
        if last_window is not None and est_type not in self.estimators_support_last_window:
            continue

        pred_func = self._predict_interval_dispatch[est_type]
        preds = pred_func(estimator=estimator, steps=steps, exog=exog, alpha=alpha)
        all_predictions.append(preds)
        estimator_ids.append(est_id)

    n_estimators = len(estimator_ids)
    if n_estimators == 1:
        all_predictions = all_predictions[0]
    else:
        all_predictions = np.stack(all_predictions).transpose(1, 0, 2).reshape(-1, 3)

    predictions = transform_numpy(
        array=all_predictions,
        transformer=self.transformer_y,
        fit=False,
        inverse_transform=True
    )
    predictions = pd.DataFrame(
        data=predictions,
        index=np.repeat(prediction_index, n_estimators),
        columns=['pred', 'lower_bound', 'upper_bound']
    )
    if self.n_estimators == 1:
        # This is done to restore the frequency
        predictions.index = prediction_index
    else:
        predictions.insert(0, 'estimator_id', np.tile(estimator_ids, steps))

    set_skforecast_warnings(suppress_warnings, action='default')

    return predictions
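The symmetry check and the conversion from percentiles to `alpha` can be tried in isolation. The following is a minimal sketch that mirrors the arithmetic in the source above (where this branch runs only when `alpha` is `None`); the helper name `interval_to_alpha` is hypothetical and not part of the skforecast API.

```python
import math


def interval_to_alpha(interval):
    """Convert a symmetric percentile interval, e.g. [2.5, 97.5], to alpha.

    Hypothetical helper for illustration only; it mirrors the arithmetic
    used in `predict_interval` but does not exist in skforecast.
    """
    lower, upper = interval
    # A symmetric interval leaves the same percentage in each tail.
    if 100 - upper != lower:
        raise ValueError(
            f"`interval` must be symmetrical, e.g. [2.5, 97.5]. Got {interval}."
        )
    # Total percentage outside the interval, expressed as a fraction.
    return 2 * (100 - upper) / 100


alpha_95 = interval_to_alpha([2.5, 97.5])  # 95% interval
alpha_80 = interval_to_alpha((10, 90))     # 80% interval
```

An asymmetric input such as `[5, 97.5]` raises `ValueError`, matching the behavior described in the docstring.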
Generate prediction intervals using sktime ARIMA model.
Source code in skforecast\recursive\_forecaster_stats.py
def _predict_interval_sktime_arima(
    self,
    estimator: object,
    steps: int,
    exog: pd.Series | pd.DataFrame | None,
    alpha: float
) -> np.ndarray:  # pragma: no cover
    """Generate prediction intervals using sktime ARIMA model."""
    fh = np.arange(1, steps + 1)
    preds = estimator.predict_interval(fh=fh, X=exog, coverage=1 - alpha).to_numpy()
    return preds
Set new values to the parameters of the model stored in the forecaster.
After calling this method, the forecaster is reset to an unfitted state.
The fit method must be called before prediction.
Parameters:
- params (dict, required): Parameter values. The expected format depends on the number of estimators in the forecaster:
    - Single estimator: A dictionary with parameter names as keys and their new values as values.
    - Multiple estimators: A dictionary where each key is an estimator id (as shown in estimator_ids) and each value is a dictionary of parameters for that estimator.
Returns: None
Source code in skforecast\recursive\_forecaster_stats.py
def set_params(
    self,
    params: dict[str, object] | dict[str, dict[str, object]]
) -> None:
    """
    Set new values to the parameters of the model stored in the forecaster.
    After calling this method, the forecaster is reset to an unfitted state.
    The `fit` method must be called before prediction.

    Parameters
    ----------
    params : dict
        Parameters values. The expected format depends on the number of
        estimators in the forecaster:

        - Single estimator: A dictionary with parameter names as keys
          and their new values as values.
        - Multiple estimators: A dictionary where each key is an
          estimator id (as shown in `estimator_ids`) and each value
          is a dictionary of parameters for that estimator.

    Returns
    -------
    None

    """

    if self.n_estimators == 1:
        # Single estimator: params is a simple dict of parameter values
        self.estimators[0] = clone(self.estimators[0])
        self.estimators[0].set_params(**params)
    else:
        # Multiple estimators: params must be a dict of dicts keyed by estimator name
        if not isinstance(params, dict):
            raise TypeError(
                f"`params` must be a dictionary. Got {type(params).__name__}."
            )

        provided_ids = set(params.keys())
        valid_ids = set(self.estimator_ids)
        invalid_ids = provided_ids - valid_ids

        if invalid_ids == provided_ids:
            raise ValueError(
                f"None of the provided estimator ids {list(invalid_ids)} "
                f"match the available estimator ids: {self.estimator_ids}."
            )

        if invalid_ids:
            warnings.warn(
                f"The following estimator ids do not match any estimator "
                f"in the forecaster and will be ignored: {list(invalid_ids)}. "
                f"Available estimator ids are: {self.estimator_ids}.",
                IgnoredArgumentWarning
            )

        for est_id, est_params in params.items():
            if est_id in valid_ids:
                idx = self.estimator_ids.index(est_id)
                self.estimators[idx] = clone(self.estimators[idx])
                self.estimators[idx].set_params(**est_params)

    self.is_fitted = False
    self.estimator_params_ = {
        est_id: est.get_params()
        for est_id, est in zip(self.estimator_ids, self.estimators)
    }
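For the multiple-estimator case, the id handling can be illustrated without any forecaster. The sketch below assumes a hypothetical helper, `select_estimator_params`, that reproduces only the set logic from the source: raise when no id matches, warn when some ids are unknown, and keep the rest. The parameter name `some_param` and the id `unknown.Model` are placeholders, and a plain `UserWarning` stands in for `IgnoredArgumentWarning`.

```python
import warnings


def select_estimator_params(params, estimator_ids):
    """Return the entries of `params` whose key is a known estimator id.

    Hypothetical helper for illustration; not part of skforecast.
    """
    if not isinstance(params, dict):
        raise TypeError(
            f"`params` must be a dictionary. Got {type(params).__name__}."
        )

    provided_ids = set(params)
    invalid_ids = provided_ids - set(estimator_ids)

    # Every provided id is unknown: nothing could be updated, so fail loudly.
    if invalid_ids == provided_ids:
        raise ValueError(
            f"None of the provided estimator ids {sorted(invalid_ids)} "
            f"match the available estimator ids: {estimator_ids}."
        )

    # Some ids are unknown: warn and ignore only those entries.
    if invalid_ids:
        warnings.warn(
            f"Ignoring unknown estimator ids: {sorted(invalid_ids)}.",
            UserWarning,
        )

    return {k: v for k, v in params.items() if k not in invalid_ids}


ids = ['skforecast.Arima', 'skforecast.Ets']
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    selected = select_estimator_params(
        {'skforecast.Ets': {'some_param': 1}, 'unknown.Model': {'x': 1}},
        ids,
    )
```

Here `selected` keeps only the `'skforecast.Ets'` entry and one warning is recorded for `'unknown.Model'`.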
This method is a placeholder to maintain API consistency. When using
the skforecast Sarimax model, fit kwargs should be passed using the
model parameter sm_fit_kwargs.
Parameters:
- fit_kwargs (Ignored, default None): Not used, present here for API consistency by convention.
Returns: None
Source code in skforecast\recursive\_forecaster_stats.py
def set_fit_kwargs(
    self,
    fit_kwargs: Any = None
) -> None:
    """
    This method is a placeholder to maintain API consistency. When using
    the skforecast Sarimax model, fit kwargs should be passed using the
    model parameter `sm_fit_kwargs`.

    Parameters
    ----------
    fit_kwargs : Ignored
        Not used, present here for API consistency by convention.

    Returns
    -------
    None

    """

    warnings.warn(
        "This method is a placeholder to maintain API consistency. When using "
        "the skforecast Sarimax model, fit kwargs should be passed using the "
        "model parameter `sm_fit_kwargs`.",
        IgnoredArgumentWarning
    )
def get_feature_importances(
    self,
    sort_importance: bool = True
) -> pd.DataFrame:
    """
    Return feature importances of the estimator stored in the forecaster.

    Parameters
    ----------
    sort_importance: bool, default True
        If `True`, sorts the feature importances in descending order.

    Returns
    -------
    feature_importances : pandas DataFrame
        Feature importances associated with each predictor.

    """

    if not self.is_fitted:
        raise NotFittedError(
            "This forecaster is not fitted yet. Call `fit` with appropriate "
            "arguments before using `get_feature_importances()`."
        )

    feature_importances = []
    for estimator, estimator_type, estimator_id in zip(
        self.estimators_, self.estimator_types, self.estimator_ids
    ):
        get_importances_func = self._feature_importances_dispatch[estimator_type]
        importance = get_importances_func(estimator)
        if importance is not None:
            importance.insert(0, 'estimator_id', estimator_id)
            feature_importances.append(importance)

    feature_importances = pd.concat(feature_importances, ignore_index=True)

    if sort_importance:
        feature_importances = feature_importances.sort_values(
            by=['estimator_id', 'importance'], ascending=False
        ).reset_index(drop=True)

    if self.n_estimators == 1:
        feature_importances = feature_importances.drop(columns=['estimator_id'])

    return feature_importances
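The concatenate-then-sort step can be reproduced with plain pandas. The sketch below uses made-up importance values and the hypothetical ids `model_a` and `model_b`; it applies the same `insert`, `concat` and `sort_values` calls as the source to show the row order that results with multiple estimators.

```python
import pandas as pd

# Per-estimator importances, as each dispatch function might return them.
# Values and ids are invented for illustration.
imp_a = pd.DataFrame({'feature': ['lag_1', 'lag_2'], 'importance': [0.5, -0.2]})
imp_b = pd.DataFrame({'feature': ['lag_1'], 'importance': [0.9]})

frames = []
for est_id, imp in [('model_a', imp_a), ('model_b', imp_b)]:
    # Prepend the estimator id so rows stay identifiable after concatenation.
    imp.insert(0, 'estimator_id', est_id)
    frames.append(imp)

combined = pd.concat(frames, ignore_index=True)

# Descending sort on both keys: estimator ids end up in reverse
# alphabetical order, and importances are descending within each id.
combined = combined.sort_values(
    by=['estimator_id', 'importance'], ascending=False
).reset_index(drop=True)
```

Because `ascending=False` applies to both sort keys, `model_b` rows come before `model_a` rows in `combined`.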
Source code in skforecast\recursive\_forecaster_stats.py
@staticmethod
def _get_feature_importances_aeon_arima(estimator) -> pd.DataFrame:  # pragma: no cover
    """Get feature importances for AEON ARIMA model."""
    return pd.DataFrame({
        'feature': [f'lag_{lag}' for lag in range(1, estimator.p + 1)] + ["ma", "intercept"],
        'importance': np.concatenate([estimator.phi_, estimator.theta_, [estimator.c_]])
    })
Source code in skforecast\recursive\_forecaster_stats.py
@staticmethod
def _get_feature_importances_aeon_ets(estimator) -> pd.DataFrame:  # pragma: no cover
    """Get feature importances for AEON ETS model."""
    warnings.warn("Feature importances is not available for the AEON ETS model.")
    return pd.DataFrame(columns=['feature', 'importance'])
Source code in skforecast\recursive\_forecaster_stats.py
@staticmethod
def _get_feature_importances_sktime_arima(estimator) -> pd.DataFrame:  # pragma: no cover
    """Get feature importances for sktime ARIMA model."""
    feature_importances = estimator._forecaster.params().to_frame().reset_index()
    feature_importances.columns = ['feature', 'importance']
    return feature_importances
def get_info_criteria(
    self,
    criteria: str = 'aic',
    method: str = 'standard'
) -> float:
    """
    Get the selected information criteria.

    Check https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAXResults.info_criteria.html
    to know more about statsmodels info_criteria method.

    Parameters
    ----------
    criteria : str, default 'aic'
        The information criteria to compute. Valid options are {'aic', 'bic',
        'hqic'}.
    method : str, default 'standard'
        The method for information criteria computation. Default is 'standard'
        method; 'lutkepohl' computes the information criteria as in Lütkepohl
        (2007).

    Returns
    -------
    metric : float
        The value of the selected information criteria.

    """

    if not self.is_fitted:
        raise NotFittedError(
            "This forecaster is not fitted yet. Call `fit` with appropriate "
            "arguments before using `get_info_criteria()`."
        )

    info_criteria = []
    for estimator, estimator_type in zip(self.estimators_, self.estimator_types):
        get_criteria_method = self._info_criteria_dispatch[estimator_type]
        value = get_criteria_method(estimator, criteria, method)
        info_criteria.append(value)

    if self.n_estimators == 1:
        results = pd.DataFrame({
            'criteria': criteria,
            'value': info_criteria
        })
    else:
        results = pd.DataFrame({
            'estimator_id': self.estimator_ids,
            'criteria': criteria,
            'value': info_criteria
        })

    return results
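As a reminder of what the three `criteria` options measure, the sketch below computes the textbook definitions from a log-likelihood. It assumes these correspond to the 'standard' method as documented by statsmodels; the individual estimators may compute their values differently, and the function `info_criterion` and the numbers passed to it are illustrative only.

```python
import math


def info_criterion(llf, n_params, n_obs, criteria='aic'):
    """Textbook information criteria from a log-likelihood `llf`.

    Illustrative helper (not part of skforecast). Lower values indicate
    a better trade-off between fit and number of parameters.
    """
    if criteria == 'aic':
        penalty = 2 * n_params
    elif criteria == 'bic':
        penalty = math.log(n_obs) * n_params
    elif criteria == 'hqic':
        penalty = 2 * math.log(math.log(n_obs)) * n_params
    else:
        raise ValueError("`criteria` must be one of {'aic', 'bic', 'hqic'}")
    # All three share the same fit term and differ only in the penalty.
    return -2 * llf + penalty


aic = info_criterion(llf=-120.0, n_params=3, n_obs=100, criteria='aic')
bic = info_criterion(llf=-120.0, n_params=3, n_obs=100, criteria='bic')
hqic = info_criterion(llf=-120.0, n_params=3, n_obs=100, criteria='hqic')
```

For these inputs the penalties order as AIC, then HQIC, then BIC, so `aic < hqic < bic`.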
Get information criteria for SARIMAX statsmodels model.
Source code in skforecast\recursive\_forecaster_stats.py
@staticmethod
def _get_info_criteria_sarimax(estimator, criteria: str, method: str) -> float:
    """Get information criteria for SARIMAX statsmodels model."""
    return estimator.get_info_criteria(criteria=criteria, method=method)
Get information criteria for skforecast Arima/Arar/Ets models.
Source code in skforecast\recursive\_forecaster_stats.py
@staticmethod
def _get_info_criteria_skforecast_stats(estimator, criteria: str, method: str) -> float:
    """Get information criteria for skforecast Arima/Arar/Ets models."""
    return estimator.get_info_criteria(criteria=criteria)
Source code in skforecast\recursive\_forecaster_stats.py
@staticmethod
def _get_info_criteria_aeon(estimator, criteria: str, method: str) -> float:  # pragma: no cover
    """Get information criteria for AEON models."""
    if criteria != 'aic':
        raise ValueError(
            "Invalid value for `criteria`. Only 'aic' is supported for "
            "AEON models."
        )
    return estimator.aic_
Source code in skforecast\recursive\_forecaster_stats.py
@staticmethod
def _get_info_criteria_sktime_arima(estimator, criteria: str, method: str) -> float:  # pragma: no cover
    """Get information criteria for sktime ARIMA model."""
    if criteria not in {'aic', 'bic', 'hqic'}:
        raise ValueError(
            "`criteria` must be one of {'aic','bic','hqic'}"
        )
    if method not in {'standard', 'lutkepohl'}:
        raise ValueError(
            "`method` must be either 'standard' or 'lutkepohl'"
        )
    return estimator._forecaster.arima_res_.info_criteria(criteria=criteria, method=method)
Get a summary DataFrame with information about all estimators in the
forecaster.
Returns:
- info (pandas DataFrame): DataFrame with columns:
    - id: Unique identifier for each estimator.
    - name: Descriptive name (available after fitting).
    - type: Fully qualified type string.
    - supports_exog: Whether the estimator supports exogenous variables.
    - supports_interval: Whether the estimator supports prediction intervals.
    - params: Dictionary of the estimator parameters.
Source code in skforecast\recursive\_forecaster_stats.py
def get_estimators_info(self) -> pd.DataFrame:
    """
    Get a summary DataFrame with information about all estimators in the
    forecaster.

    Returns
    -------
    info : pandas DataFrame
        DataFrame with columns:

        - id: Unique identifier for each estimator.
        - name: Descriptive name (available after fitting).
        - type: Full qualified type string.
        - supports_exog: Whether the estimator supports exogenous variables.
        - supports_interval: Whether the estimator supports prediction intervals.
        - params: Dictionary of the estimator parameters.

    """

    supports_exog = [
        est_type in self.estimators_support_exog
        for est_type in self.estimator_types
    ]
    supports_interval = [
        est_type in self.estimators_support_interval
        for est_type in self.estimator_types
    ]
    params = [
        str(est_params)
        for est_params in self.estimator_params_.values()
    ]

    info = pd.DataFrame({
        'id': self.estimator_ids,
        'name': self.estimator_names_,
        'type': self.estimator_types,
        'supports_exog': supports_exog,
        'supports_interval': supports_interval,
        'params': params
    })

    return info
Reduce memory usage by removing internal arrays of the estimator not
needed for prediction. This method only works for estimators that
expose the method reduce_memory().
The arrays removed depend on the specific estimator used.
Returns: None
Source code in skforecast\recursive\_forecaster_stats.py
def reduce_memory(self) -> None:
    """
    Reduce memory usage by removing internal arrays of the estimator not
    needed for prediction. This method only works for estimators that
    expose the method `reduce_memory()`.

    The arrays removed depend on the specific estimator used.

    Returns
    -------
    None

    """

    if not self.is_fitted:
        raise NotFittedError(
            "This forecaster is not fitted yet. Call `fit` with appropriate "
            "arguments before using `reduce_memory()`."
        )

    unsupported_reduce_memory = [
        est_id
        for est_id, estimator_type in zip(self.estimator_ids, self.estimator_types)
        if estimator_type not in self.estimators_support_reduce_memory
    ]
    if unsupported_reduce_memory:
        warnings.warn(
            f"Memory reduction is not implemented for estimators: {unsupported_reduce_memory}. "
            f"These estimators will be skipped. Available estimators for memory "
            f"reduction are: {list(self.estimators_support_reduce_memory)}.",
            IgnoredArgumentWarning
        )

    for estimator, est_type in zip(self.estimators_, self.estimator_types):
        if est_type in self.estimators_support_reduce_memory:
            estimator.reduce_memory()
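The skip-with-a-warning pattern used here can be sketched with toy objects. Note one substitution: the source decides support by looking up the estimator type in `estimators_support_reduce_memory`, whereas this sketch simply checks for the method with `hasattr`. The classes `WithReduce` and `WithoutReduce` and the function `reduce_memory_all` are hypothetical.

```python
import warnings


class WithReduce:
    """Toy estimator exposing `reduce_memory()` (hypothetical)."""

    def __init__(self):
        self.residuals = [0.1, -0.2, 0.3]

    def reduce_memory(self):
        # Drop an array that is not needed for prediction.
        self.residuals = None


class WithoutReduce:
    """Toy estimator with no `reduce_memory()` method (hypothetical)."""


def reduce_memory_all(estimators, estimator_ids):
    """Call `reduce_memory()` where available; return the skipped ids."""
    skipped = [
        est_id
        for est_id, est in zip(estimator_ids, estimators)
        if not hasattr(est, 'reduce_memory')
    ]
    if skipped:
        warnings.warn(f"Memory reduction skipped for estimators: {skipped}.")
    for est in estimators:
        if hasattr(est, 'reduce_memory'):
            est.reduce_memory()
    return skipped


est_a, est_b = WithReduce(), WithoutReduce()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    skipped_ids = reduce_memory_all([est_a, est_b], ['a', 'b'])
```

After the call, `est_a.residuals` is `None`, while `est_b` is untouched and reported in `skipped_ids`.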