model_selection
skforecast.model_selection.backtesting_forecaster(forecaster, y, initial_train_size, steps, metric, exog=None, verbose=False)
Backtesting (validation) of a ForecasterAutoreg, ForecasterCustom, ForecasterAutoregCustom or ForecasterAutoregMultiOutput object. The model is trained only once, using the first initial_train_size observations. In each iteration, steps predictions are evaluated. This evaluation is much faster than cv_forecaster() because the model is trained only once.
Parameters
forecaster (ForecasterAutoreg, ForecasterCustom, ForecasterAutoregCustom, ForecasterAutoregMultiOutput) — ForecasterAutoreg, ForecasterCustom, ForecasterAutoregCustom or ForecasterAutoregMultiOutput object.
y (1D np.ndarray, pd.Series) — Training time series values.
initial_train_size (int) — Number of samples in the initial train split.
steps (int) — Number of steps to predict.
metric ({'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error'}) — Metric used to quantify the goodness of fit of the model.
exog (np.ndarray, pd.Series, pd.DataFrame, default `None`) — Exogenous variable/s included as predictor/s. Must have the same number of observations as y and should be aligned so that y[i] is regressed on exog[i].
verbose (bool, default `False`) — Print number of folds used for backtesting.
Returns
Value of the metric.
test_predictions (1D np.ndarray) — Value of predictions.
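The train-once, predict-in-folds procedure described for backtesting_forecaster can be sketched in plain Python. This is a minimal illustration, not skforecast's implementation: `fit_drift` and `backtest_no_refit` are hypothetical names, a simple drift estimate stands in for a real forecaster, and the sketch assumes each fold starts from the last observed value before it.

```python
def fit_drift(train):
    """Hypothetical stand-in model: average change per step in the training data."""
    return (train[-1] - train[0]) / (len(train) - 1)


def backtest_no_refit(y, initial_train_size, steps):
    """Fit once on the first observations, then predict the rest in folds of `steps`."""
    drift = fit_drift(y[:initial_train_size])  # the only training step
    predictions = []
    for start in range(initial_train_size, len(y), steps):
        last_observed = y[start - 1]  # assumption: each fold starts from real data
        fold_len = min(steps, len(y) - start)  # the last fold may be shorter
        predictions.extend(last_observed + drift * h for h in range(1, fold_len + 1))
    actual = y[initial_train_size:]
    # Mean squared error over all backtest predictions
    mse = sum((a - p) ** 2 for a, p in zip(actual, predictions)) / len(actual)
    return mse, predictions


y = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 9.0, 10.0]
metric_value, preds = backtest_no_refit(y, initial_train_size=4, steps=2)
print(metric_value, preds)
```

The model parameters never change after the first fit, which is why this style of validation is cheap compared with refitting in every fold.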
skforecast.model_selection.backtesting_forecaster_intervals(forecaster, y, initial_train_size, steps, metric, exog=None, interval=[5, 95], n_boot=500, in_sample_residuals=True, verbose=False)
Backtesting (validation) of a ForecasterAutoreg, ForecasterCustom or ForecasterAutoregCustom object. The model is trained only once, using the first initial_train_size observations. In each iteration, steps predictions are evaluated. Both predictions and their intervals are calculated. This evaluation is much faster than cv_forecaster() because the model is trained only once.
Parameters
forecaster (ForecasterAutoreg, ForecasterCustom, ForecasterAutoregCustom) — ForecasterAutoreg, ForecasterCustom or ForecasterAutoregCustom object.
y (1D np.ndarray, pd.Series) — Training time series values.
initial_train_size (int) — Number of samples in the initial train split.
steps (int) — Number of steps to predict.
metric ({'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error'}) — Metric used to quantify the goodness of fit of the model.
exog (np.ndarray, pd.Series, pd.DataFrame, default `None`) — Exogenous variable/s included as predictor/s. Must have the same number of observations as y and should be aligned so that y[i] is regressed on exog[i].
interval (list, default `[5, 95]`) — Confidence of the estimated prediction interval. Sequence of percentiles to compute, which must be between 0 and 100 inclusive.
n_boot (int, default `500`) — Number of bootstrapping iterations used to estimate prediction intervals.
in_sample_residuals (bool, default `True`) — If True, residuals from the training data are used as a proxy of the prediction error to create prediction intervals.
verbose (bool, default `False`) — Print number of folds used for backtesting.
Returns
2D np.ndarray of shape (steps, 3) with the predicted values and their estimated intervals.
Column 0 = predictions
Column 1 = lower bound of the interval
Column 2 = upper bound of the interval
Value of the metric (np.ndarray, shape (1,)).
Notes
More information about prediction intervals in forecasting: Rob J Hyndman and George Athanasopoulos, Forecasting: Principles and Practice (2nd ed), https://otexts.com/fpp2/prediction-intervals.html
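How interval, n_boot and the residuals fit together can be illustrated with a one-step bootstrap. The sketch below is an assumption-laden simplification rather than the library's code: `percentile` and `bootstrap_interval` are hypothetical helpers, the point forecast and residuals are made-up numbers, and only a single forecast step is simulated.

```python
import random


def percentile(values, q):
    """Percentile with linear interpolation between ranks, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    low = int(pos)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (pos - low)


def bootstrap_interval(point_forecast, residuals, interval=(5, 95), n_boot=500, seed=0):
    """Resample residuals, add them to the forecast, and read off two percentiles."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    simulated = [point_forecast + rng.choice(residuals) for _ in range(n_boot)]
    return percentile(simulated, interval[0]), percentile(simulated, interval[1])


# Illustrative in-sample residuals (errors observed on the training data)
residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
lower, upper = bootstrap_interval(10.0, residuals)
print(lower, upper)
```

A larger n_boot gives more stable percentile estimates at the cost of more simulations; the interval can never be wider than the spread of the residuals it resamples.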
skforecast.model_selection.cv_forecaster(forecaster, y, initial_train_size, steps, metric, exog=None, allow_incomplete_fold=True, verbose=True)
Cross-validation of a ForecasterAutoreg, ForecasterCustom, ForecasterAutoregCustom or ForecasterAutoregMultiOutput object. The order of data is maintained and the training set increases in each iteration.
Parameters
forecaster (ForecasterAutoreg, ForecasterCustom, ForecasterAutoregCustom, ForecasterAutoregMultiOutput) — ForecasterAutoreg, ForecasterCustom, ForecasterAutoregCustom or ForecasterAutoregMultiOutput object.
y (1D np.ndarray, pd.Series) — Training time series values.
initial_train_size (int) — Number of samples in the initial train split.
steps (int) — Number of steps to predict.
metric ({'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error'}) — Metric used to quantify the goodness of fit of the model.
exog (np.ndarray, pd.Series, pd.DataFrame, default `None`) — Exogenous variable/s included as predictor/s. Must have the same number of observations as y and should be aligned so that y[i] is regressed on exog[i].
allow_incomplete_fold (bool, default `True`) — The last test set is allowed to be incomplete if it does not reach steps observations. Otherwise, the latest observations are discarded.
verbose (bool, default `True`) — Print number of folds used for cross validation.
Returns
Value of the metric for each fold.
Predictions (1D np.ndarray).
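In contrast to backtesting, cross-validation refits on a growing training set and reports one metric value per fold. A rough sketch under stated assumptions: `cv_expanding` is a hypothetical helper, the "model" is just the mean of the training data, and mean absolute error is the metric.

```python
def cv_expanding(y, initial_train_size, steps):
    """Refit a stand-in mean model on a growing training set; return MAE per fold."""
    fold_metrics = []
    for start in range(initial_train_size, len(y), steps):
        train, test = y[:start], y[start:start + steps]
        forecast = sum(train) / len(train)  # "refit": recomputed on all data seen so far
        mae = sum(abs(value - forecast) for value in test) / len(test)
        fold_metrics.append(mae)
    return fold_metrics


y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
fold_metrics = cv_expanding(y, initial_train_size=2, steps=2)
print(fold_metrics)  # one value per fold
```

Because the model is recomputed in every fold, the cost grows with the number of folds, which is the trade-off against the single fit used in backtesting.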
skforecast.model_selection.grid_search_forecaster(forecaster, y, param_grid, initial_train_size, steps, metric, exog=None, lags_grid=None, method='cv', allow_incomplete_fold=True, return_best=True, verbose=True)
Exhaustive search over specified parameter values for a Forecaster object. Validation is done using time series cross-validation or backtesting.
Parameters
forecaster (ForecasterAutoreg, ForecasterCustom, ForecasterAutoregCustom, ForecasterAutoregMultiOutput) — ForecasterAutoreg, ForecasterCustom, ForecasterAutoregCustom or ForecasterAutoregMultiOutput object.
y (1D np.ndarray, pd.Series) — Training time series values.
param_grid (dict) — Dictionary with parameter names (str) as keys and lists of parameter settings to try as values.
initial_train_size (int) — Number of samples in the initial train split.
steps (int) — Number of steps to predict.
metric ({'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error'}) — Metric used to quantify the goodness of fit of the model.
exog (np.ndarray, pd.Series, pd.DataFrame, default `None`) — Exogenous variable/s included as predictor/s. Must have the same number of observations as y and should be aligned so that y[i] is regressed on exog[i].
lags_grid (list of int, lists, np.ndarray or range, default `None`) — Lists of lags to try. Only used if forecaster is an instance of ForecasterAutoreg.
method ({'cv', 'backtesting'}, default `'cv'`) — Method used to estimate the metric for each parameter combination. 'cv' for time series cross-validation and 'backtesting' for simple backtesting. 'backtesting' is much faster since the model is fitted only once.
allow_incomplete_fold (bool, default `True`) — The last test set is allowed to be incomplete if it does not reach steps observations. Otherwise, the latest observations are discarded.
return_best (bool, default `True`) — Refit the forecaster using the best found parameters on the whole data.
verbose (bool, default `True`) — Print number of folds used for cv or backtesting.
Returns
Metric value estimated for each combination of parameters.
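The exhaustive search can be pictured as a loop over every lags candidate and every combination in param_grid. The following sketch only illustrates that enumeration: `expand_grid`, `grid_search` and the scoring function `toy_metric` are hypothetical, and the parameter names `alpha` and `depth` are placeholders, not parameters of any particular regressor.

```python
from itertools import product


def expand_grid(param_grid):
    """Turn {'a': [1, 2], 'b': [3]} into [{'a': 1, 'b': 3}, {'a': 2, 'b': 3}]."""
    names = list(param_grid)
    return [dict(zip(names, combo)) for combo in product(*(param_grid[n] for n in names))]


def grid_search(param_grid, lags_grid, evaluate):
    """Score every (lags, params) pair and sort from best (lowest metric) to worst."""
    results = []
    for lags in lags_grid:
        for params in expand_grid(param_grid):
            results.append({"lags": lags, "params": params, "metric": evaluate(lags, params)})
    return sorted(results, key=lambda row: row["metric"])


def toy_metric(lags, params):
    """Placeholder score; a real search would run cv or backtesting here."""
    return abs(lags - 3) + params["alpha"]


param_grid = {"alpha": [0.1, 1.0], "depth": [2, 4]}
results = grid_search(param_grid, lags_grid=[2, 3], evaluate=toy_metric)
print(len(results), results[0])
```

The number of evaluations is the product of all list lengths times the number of lags candidates, so the choice between 'cv' and 'backtesting' has a large effect on total run time.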
skforecast.model_selection.time_series_spliter(y, initial_train_size, steps, allow_incomplete_fold=True, verbose=True)
Split indices of a time series into multiple train-test pairs. The order of data is maintained and the training set increases in each iteration.
Parameters
y (1D np.ndarray, pd.Series) — Training time series values.
initial_train_size (int) — Number of samples in the initial train split.
steps (int) — Number of steps to predict.
allow_incomplete_fold (bool, default `True`) — The last test set is allowed to be incomplete if it does not reach steps observations. Otherwise, the latest observations are discarded.
verbose (bool, default `True`) — Print number of splits created.
Returns
Training indices.
Test indices (1D np.ndarray).
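The expanding-window split and the effect of allow_incomplete_fold can be sketched as a small generator. This is an illustrative reimplementation under assumptions, not the library's code: `expanding_splits` is a hypothetical name and it works on a number of observations rather than on the series itself.

```python
def expanding_splits(n_obs, initial_train_size, steps, allow_incomplete_fold=True):
    """Yield (train_indices, test_indices) pairs with a growing training set."""
    for start in range(initial_train_size, n_obs, steps):
        end = min(start + steps, n_obs)
        if end - start < steps and not allow_incomplete_fold:
            break  # discard the trailing observations that cannot fill a fold
        yield list(range(start)), list(range(start, end))


for train_idx, test_idx in expanding_splits(n_obs=10, initial_train_size=4, steps=4):
    print(len(train_idx), test_idx)
```

With 10 observations, an initial training size of 4 and steps of 4, the second fold only has two test observations; setting allow_incomplete_fold=False drops it.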