Change Log
All significant changes to this project are documented in this release file.
[0.9.0] - [2023-07-09]
The main changes in this release are:
- ForecasterAutoregDirect and ForecasterAutoregMultiVariate include the n_jobs argument in their fit method, allowing multi-process parallelization for improved performance.
- All backtesting and grid search functions have been extended to include the n_jobs argument, allowing multi-process parallelization for improved performance.
- Argument refit can now also be an integer in all backtesting-dependent functions in modules model_selection, model_selection_multiseries, and model_selection_sarimax. This allows the Forecaster to be retrained every refit iterations.
- ForecasterAutoregMultiSeries and ForecasterAutoregMultiSeriesCustom can be trained using series of different lengths. This means that the model can handle datasets with different numbers of data points in each series.
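As a rough illustration of an integer refit, the sketch below marks which backtesting folds would trigger a retraining. It assumes the forecaster is always fit on the first fold and then again every refit folds; the helper refit_schedule is hypothetical and is not part of skforecast.

```python
def refit_schedule(n_folds, refit):
    """Return one boolean per fold: True if the model is (re)trained on it.

    Hypothetical helper, not a skforecast function. Assumes fold 0 always
    trains and that an integer `refit` means "retrain every `refit` folds".
    """
    if isinstance(refit, bool):
        # refit=False: train once on the first fold; refit=True: train on all.
        return [i == 0 or refit for i in range(n_folds)]
    if refit < 1:
        raise ValueError("an integer refit must be >= 1")
    return [i % refit == 0 for i in range(n_folds)]


schedule = refit_schedule(n_folds=7, refit=3)
print(schedule)
```

With seven folds and refit=3, only the first, fourth and seventh folds retrain, which is where the time saving comes from.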
Added
- Support for scikit-learn 1.3.x.
- Argument n_jobs='auto' to fit method in ForecasterAutoregDirect and ForecasterAutoregMultiVariate to allow multi-process parallelization.
- Argument n_jobs='auto' to all backtesting-dependent functions in modules model_selection, model_selection_multiseries and model_selection_sarimax to allow multi-process parallelization.
- Argument refit can now also be an integer in all backtesting-dependent functions in modules model_selection, model_selection_multiseries, and model_selection_sarimax. This allows the Forecaster to be retrained every refit iterations.
- ForecasterAutoregMultiSeries and ForecasterAutoregMultiSeriesCustom allow series of different lengths to be used for training.
- Added show_progress to grid search functions.
- Added functions select_n_jobs_backtesting and select_n_jobs_fit_forecaster to utils to select the number of jobs to use during multi-process parallelization.
Changed
- Remove get_feature_importance in favor of get_feature_importances in all Forecasters (deprecated since 0.8.0).
- The model_selection._create_backtesting_folds function now also returns the last window indices and whether or not to train the forecaster.
- The model_selection functions _backtesting_forecaster_refit and _backtesting_forecaster_no_refit have been unified in _backtesting_forecaster.
- The model_selection_multiseries functions _backtesting_forecaster_multiseries_refit and _backtesting_forecaster_multiseries_no_refit have been unified in _backtesting_forecaster_multiseries.
- The model_selection_sarimax functions _backtesting_refit_sarimax and _backtesting_no_refit_sarimax have been unified in _backtesting_sarimax.
- utils.preprocess_y allows a pandas DataFrame as input.
Fixed
- Ensure reproducibility of Direct Forecasters when using predict_bootstrapping, predict_dist and predict_interval with a list of steps.
- The create_train_X_y method returns a dict of pandas Series as y_train in ForecasterAutoregDirect and ForecasterAutoregMultiVariate. This ensures that each series has the appropriate index according to the step to be trained.
- The filter_train_X_y_for_step method in ForecasterAutoregDirect and ForecasterAutoregMultiVariate now updates the index of X_train_step to ensure correct alignment with y_train_step.
[0.8.1] - [2023-05-27]
Added
- Argument store_in_sample_residuals=True in fit method added to all forecasters to speed up functions such as backtesting.
Changed
- Refactor utils.exog_to_direct and utils.exog_to_direct_numpy to increase performance.
Fixed
- utils.check_exog_dtypes now compares the dtype.name instead of the dtype. (suggested by Metaming https://github.com/Metaming)
[0.8.0] - [2023-05-16]
Added
- Added the fit_kwargs argument to all forecasters to allow the inclusion of additional keyword arguments passed to the regressor's fit method.
- Added the set_fit_kwargs method to set the fit_kwargs attribute.
- Support for pandas 2.0.x.
- Added exceptions module with custom warnings.
- Added function utils.check_exog_dtypes to issue a warning if exogenous variables are not of type int, float, or category. Raise Exception if exog has categorical columns with non-integer values.
- Added function utils.get_exog_dtypes to get the data types of the exogenous variables included during the training of the forecaster model.
- Added function utils.cast_exog_dtypes to cast data types of the exogenous variables using a dictionary as a mapping.
- Added function utils.check_select_fit_kwargs to check if the argument fit_kwargs is a dictionary and select only the keys used by the fit method of the regressor.
- Added function model_selection._create_backtesting_folds to provide train/test indices (position) for backtesting functions.
- Added argument gap to functions in model_selection, model_selection_multiseries and model_selection_sarimax to omit observations between training and prediction.
- Added argument show_progress to functions model_selection.backtesting_forecaster, model_selection_multiseries.backtesting_forecaster_multiseries and model_selection_sarimax.backtesting_forecaster_sarimax to indicate whether to show a progress bar.
- Added argument remove_suffix, default False, to the method filter_train_X_y_for_step() in ForecasterAutoregDirect and ForecasterAutoregMultiVariate. If remove_suffix=True the suffix "_step_i" will be removed from the column names of the training matrices.
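The key-selection step described for utils.check_select_fit_kwargs can be approximated with the standard library alone. The sketch below is an assumption about the general idea, not the library code: ToyRegressor and select_fit_kwargs are made-up names, and the filtering relies on inspect.signature to read the parameters of the regressor's fit method.

```python
import inspect


class ToyRegressor:
    """Stand-in for a regressor; only used to have a fit signature to inspect."""

    def fit(self, X, y, sample_weight=None):
        return self


def select_fit_kwargs(regressor, fit_kwargs=None):
    """Keep only the entries of fit_kwargs that regressor.fit accepts.

    Hypothetical helper written for illustration; not the skforecast code.
    """
    if fit_kwargs is None:
        return {}
    if not isinstance(fit_kwargs, dict):
        raise TypeError("fit_kwargs must be a dict")
    accepted = inspect.signature(regressor.fit).parameters
    return {k: v for k, v in fit_kwargs.items() if k in accepted}


selected = select_fit_kwargs(
    ToyRegressor(), {"sample_weight": [1.0, 0.5], "early_stopping": True}
)
print(selected)
```

Here early_stopping is dropped because the toy fit method has no such parameter, while sample_weight is kept.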
Changed
- Rename optional dependency package statsmodels to sarimax. Now only pmdarima will be installed; statsmodels is no longer needed.
- Rename get_feature_importance() to get_feature_importances() in all Forecasters. The get_feature_importance() method will be removed in skforecast 0.9.0.
- Refactor get_feature_importances() in all Forecasters.
- Remove model_selection_statsmodels in favor of ForecasterSarimax and model_selection_sarimax (deprecated since 0.7.0).
- Remove attributes create_predictors and source_code_create_predictors in favor of fun_predictors and source_code_fun_predictors in ForecasterAutoregCustom (deprecated since 0.7.0).
- The utils.check_exog function now includes a new optional parameter, allow_nan, that controls whether a warning should be issued if the input exog contains NaN values.
- utils.check_exog is applied before and after exog transformations.
- The utils.preprocess_y function now includes a new optional parameter, return_values, that controls whether to return a numpy ndarray with the values of y or not. This new option is intended to avoid copying data when it is not necessary.
- The utils.preprocess_exog function now includes a new optional parameter, return_values, that controls whether to return a numpy ndarray with the values of exog or not. This new option is intended to avoid copying data when it is not necessary.
- Replaced tqdm.tqdm by tqdm.auto.tqdm.
- Refactor utils.exog_to_direct.
Fixed
- The dtypes of exogenous variables are maintained when generating the training matrices with the create_train_X_y method in all the Forecasters.
[0.7.0] - [2023-03-21]
Added
- Class ForecasterAutoregMultiSeriesCustom.
- Class ForecasterSarimax and model_selection_sarimax (wrapper of pmdarima).
- Method predict_interval() to ForecasterAutoregDirect and ForecasterAutoregMultiVariate.
- Method predict_bootstrapping() to all forecasters, to generate multiple forecasting predictions using a bootstrapping process.
- Method predict_dist() to all forecasters, to fit a given probability distribution for each step using a bootstrapping process.
- Function plot_prediction_distribution in module plot.
- Alias backtesting_forecaster_multivariate for backtesting_forecaster_multiseries in model_selection_multiseries module.
- Alias grid_search_forecaster_multivariate for grid_search_forecaster_multiseries in model_selection_multiseries module.
- Alias random_search_forecaster_multivariate for random_search_forecaster_multiseries in model_selection_multiseries module.
- Attribute forecaster_id to all Forecasters.
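The bootstrapping idea behind predict_bootstrapping() and predict_interval() can be sketched without the library: add residuals resampled with replacement to the point predictions, then read empirical quantiles per step. This is a deliberately simplified version that assumes residuals are drawn independently for each step and are not propagated through a recursive forecast; both function names are hypothetical.

```python
import random


def bootstrap_predictions(point_preds, residuals, n_boot=200, seed=123):
    """Return n_boot simulated paths: point prediction + resampled residual.

    Simplified illustration: residuals are drawn independently for each step
    and are not fed back into later steps.
    """
    rng = random.Random(seed)
    return [
        [pred + rng.choice(residuals) for pred in point_preds]
        for _ in range(n_boot)
    ]


def interval_from_paths(paths, lower=0.1, upper=0.9):
    """Empirical (lower, upper) quantiles per step, by sorting each column."""
    bounds = []
    for step_values in zip(*paths):
        ordered = sorted(step_values)
        lo_idx = int(lower * (len(ordered) - 1))
        hi_idx = int(upper * (len(ordered) - 1))
        bounds.append((ordered[lo_idx], ordered[hi_idx]))
    return bounds


paths = bootstrap_predictions([10.0, 11.0, 12.0], [-1.0, -0.5, 0.0, 0.5, 1.0])
intervals = interval_from_paths(paths)
print(intervals)
```

Fixing the seed makes the simulated paths, and therefore the interval, reproducible between runs.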
Changed
- Deprecated python 3.7 compatibility.
- Added python 3.11 compatibility.
- model_selection_statsmodels is deprecated in favor of ForecasterSarimax and model_selection_sarimax. It will be removed in version 0.8.0.
- Remove levels_weights argument in grid_search_forecaster_multiseries and random_search_forecaster_multiseries, deprecated since version 0.6.0. Use series_weights and weights_func when creating the forecaster instead.
- Attributes create_predictors and source_code_create_predictors renamed to fun_predictors and source_code_fun_predictors in ForecasterAutoregCustom. Old names will be removed in version 0.8.0.
- Remove engine 'skopt' in bayesian_search_forecaster in favor of engine 'optuna'. To continue using it, use skforecast 0.6.0.
- in_sample_residuals and out_sample_residuals are stored as numpy ndarrays instead of pandas series.
- In ForecasterAutoregMultiSeries, set_out_sample_residuals() is now expecting a dict for the residuals argument instead of a pandas DataFrame.
- Remove the scikit-optimize dependency.
Fixed
- Remove operator ** in set_params() method for all forecasters.
- Replace getfullargspec in favor of inspect.signature (contribution by @jordisilv).
[0.6.0] - [2022-11-30]
Added
- Class ForecasterAutoregMultiVariate.
- Function initialize_lags in utils module to create lags values in the initialization of forecasters (applies to all forecasters).
- Function initialize_weights in utils module to check and initialize arguments series_weights and weight_func (applies to all forecasters).
- Argument weights_func in all Forecasters to allow weighted time series forecasting. Individual time-based weights can be assigned to each value of the series during the model training.
- Argument series_weights in ForecasterAutoregMultiSeries to define individual weights for each series.
- Include argument random_state in all Forecasters' set_out_sample_residuals methods for random sampling with reproducible output.
- In ForecasterAutoregMultiSeries, predict and predict_interval methods allow the simultaneous prediction of multiple levels.
- backtesting_forecaster_multiseries allows backtesting multiple levels simultaneously.
- metric argument can be a list in grid_search_forecaster_multiseries, random_search_forecaster_multiseries. If metric is a list, multiple metrics will be calculated. (suggested by Pablo Dávila Herrero https://github.com/Pablo-Davila)
- Function multivariate_time_series_corr in module utils.
- Function plot_multivariate_time_series_corr in module plot.
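To make the time-based weights mentioned above concrete, here is a minimal sketch of a weight function that receives the index of the series (a plain list of dates in this example) and returns one weight per observation, giving zero weight to a period the model should ignore. The function name, the dates and the list-based index are all assumptions made for illustration; the exact signature the forecasters expect is not described in this change log.

```python
from datetime import date, timedelta


def custom_weights(index, start=date(2022, 1, 10), end=date(2022, 1, 12)):
    """Return weight 0.0 for dates inside [start, end] and 1.0 elsewhere."""
    return [0.0 if start <= day <= end else 1.0 for day in index]


# Seven consecutive days starting on 2022-01-08.
index = [date(2022, 1, 8) + timedelta(days=i) for i in range(7)]
weights = custom_weights(index)
print(weights)
```

A function of this shape can down-weight known anomalies such as outages, without removing the rows from the training data.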
Changed
- ForecasterAutoregDirect allows predicting specific steps.
- Remove ForecasterAutoregMultiOutput in favor of ForecasterAutoregDirect (deprecated since 0.5.0).
- Rename function exog_to_multi_output to exog_to_direct in utils module.
- In ForecasterAutoregMultiSeries, rename parameter series_levels to series_col_names.
- In ForecasterAutoregMultiSeries, change type of out_sample_residuals to a dict of numpy ndarrays.
- In ForecasterAutoregMultiSeries, delete argument level from method set_out_sample_residuals.
- In ForecasterAutoregMultiSeries, level argument of predict and predict_interval renamed to levels.
- In backtesting_forecaster_multiseries, level argument of predict and predict_interval renamed to levels.
- In check_predict_input function, argument level renamed to levels and series_levels renamed to series_col_names.
- In backtesting_forecaster_multiseries, metrics_levels output is now a pandas DataFrame.
- In grid_search_forecaster_multiseries and random_search_forecaster_multiseries, argument levels_weights is deprecated since version 0.6.0, and will be removed in version 0.7.0. Use series_weights and weights_func when creating the forecaster instead.
- Refactor _create_lags_ in ForecasterAutoreg, ForecasterAutoregDirect and ForecasterAutoregMultiSeries. (suggested by Bennett https://github.com/Bennett561)
- Refactor backtesting_forecaster and backtesting_forecaster_multiseries.
- In ForecasterAutoregDirect, filter_train_X_y_for_step now starts at 1 (previously 0).
- In ForecasterAutoregDirect, DataFrame y_train columns now start at 1, y_step_1 (previously y_step_0).
- Remove cv_forecaster from module model_selection.
Fixed
- In ForecasterAutoregMultiSeries, argument last_window of the predict method now works when it is a pandas DataFrame.
- In ForecasterAutoregMultiSeries, fix bug in transformers initialization.
[0.5.1] - [2022-10-05]
Added
- Check that exog and y have the same length in _evaluate_grid_hyperparameters and bayesian_search_forecaster to avoid fit exception when return_best.
- Check that exog and series have the same length in _evaluate_grid_hyperparameters_multiseries to avoid fit exception when return_best.
Changed
- Argument levels_list in grid_search_forecaster_multiseries, random_search_forecaster_multiseries and _evaluate_grid_hyperparameters_multiseries renamed to levels.
Fixed
- ForecasterAutoregMultiOutput updated to match ForecasterAutoregDirect.
- Fix Exception to raise when level_weights does not add up to a number close to 1.0 (previously it had to be exactly 1.0) in grid_search_forecaster_multiseries, random_search_forecaster_multiseries and _evaluate_grid_hyperparameters_multiseries.
- Create_train_X_y in ForecasterAutoregMultiSeries now works when the forecaster is not fitted.
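The reason for checking that the weights add up to a number close to 1.0, rather than exactly 1.0, is floating-point rounding. The sketch below shows the effect with the standard library; check_weights_sum and its tolerance are assumptions for illustration, not the check used by the library.

```python
import math


def check_weights_sum(weights, tol=1e-9):
    """Raise if the weights do not sum to approximately 1.0.

    Hypothetical check; the tolerance is an assumption, not the library's.
    """
    total = sum(weights.values())
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise ValueError(f"weights must add up to 1.0, got {total}")
    return total


# Ten equal weights of 0.1 do not sum to exactly 1.0 in binary floating point.
level_weights = {f"series_{i}": 0.1 for i in range(10)}
print(sum(level_weights.values()) == 1.0)  # the exact comparison fails
print(check_weights_sum(level_weights))    # the tolerant comparison passes
```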
[0.5.0] - [2022-09-23]
Added
- New arguments transformer_y (transformer_series for multiseries) and transformer_exog in all forecaster classes. They are used to transform (scaling, max-min, ...) the modeled time series and exogenous variables inside the forecaster.
- Functions in utils transform_series and transform_dataframe to carry out the transformation of the modeled time series and exogenous variables.
- Functions _backtesting_forecaster_verbose, random_search_forecaster, _evaluate_grid_hyperparameters, bayesian_search_forecaster, _bayesian_search_optuna and _bayesian_search_skopt in model_selection.
- Created ForecasterAutoregMultiSeries class for modeling multiple time series simultaneously.
- Created module model_selection_multiseries. Functions: _backtesting_forecaster_multiseries_refit, _backtesting_forecaster_multiseries_no_refit, backtesting_forecaster_multiseries, grid_search_forecaster_multiseries, random_search_forecaster_multiseries and _evaluate_grid_hyperparameters_multiseries.
- Function _check_interval in utils. (suggested by Thomas Karaouzene https://github.com/tkaraouzene)
- metric can be a list in backtesting_forecaster, grid_search_forecaster, random_search_forecaster, backtesting_forecaster_multiseries. If metric is a list, multiple metrics will be calculated. (suggested by Pablo Dávila Herrero https://github.com/Pablo-Davila)
- Skforecast works with python 3.10.
- Functions save_forecaster and load_forecaster to module utils.
- get_feature_importance() method checks if the forecaster is fitted.
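The behavior of a metric argument that accepts either a single metric or a list can be sketched as follows. This is only an illustration of the pattern: the two metric functions are hand-written here and the evaluate helper is hypothetical, not a skforecast function.

```python
def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def evaluate(y_true, y_pred, metric):
    """Accept one metric callable or a list of them; return {name: value}."""
    metrics = metric if isinstance(metric, list) else [metric]
    return {m.__name__: m(y_true, y_pred) for m in metrics}


scores = evaluate(
    y_true=[3.0, 5.0, 7.0],
    y_pred=[2.0, 5.0, 9.0],
    metric=[mean_absolute_error, mean_squared_error],
)
print(scores)
```

Normalizing the argument to a list up front keeps the rest of the evaluation code identical for both cases.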
Changed
- backtesting_forecaster changes the default value of argument fixed_train_size: bool=True.
- Remove argument set_out_sample_residuals in function backtesting_forecaster (deprecated since 0.4.2).
- backtesting_forecaster verbose now includes fold size.
- grid_search_forecaster results include the name of the used metric as column name.
- Remove get_coef method from ForecasterAutoreg, ForecasterAutoregCustom and ForecasterAutoregMultiOutput (deprecated since 0.4.3).
- _get_metric now allows mean_squared_log_error.
- ForecasterAutoregMultiOutput has been renamed to ForecasterAutoregDirect. ForecasterAutoregMultiOutput will be removed in version 0.6.0.
- check_predict_input updated to check ForecasterAutoregMultiSeries inputs.
- set_out_sample_residuals has a new argument transform to transform the residuals before being stored.
Fixed
- fit now stores last_window values with len = forecaster.max_lag in ForecasterAutoreg and ForecasterAutoregCustom.
- in_sample_residuals stored as a pd.Series when len(residuals) > 1000.
[0.4.3] - [2022-03-18]
Added
- Checks if all elements in lags are int when creating ForecasterAutoreg and ForecasterAutoregMultiOutput.
- Add fixed_train_size: bool=False argument to backtesting_forecaster and backtesting_sarimax.
Changed
- Rename get_metric to _get_metric.
- Functions in model_selection module allow custom metrics.
- Functions in model_selection_statsmodels module allow custom metrics.
- Change function set_out_sample_residuals (ForecasterAutoreg and ForecasterAutoregCustom): residuals argument must be a pandas Series (was numpy ndarray).
- Returned value of backtesting functions (model_selection and model_selection_statsmodels) is now a float (was numpy ndarray).
- get_coef and get_feature_importance methods unified in get_feature_importance.
Fixed
- Requirements versions.
- Method fit doesn't remove out_sample_residuals each time the forecaster is fitted.
- Added random seed to residuals downsampling (ForecasterAutoreg and ForecasterAutoregCustom).
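Seeded downsampling of residuals, combined with the cap of 1000 stored residuals mentioned elsewhere in this change log, can be sketched like this. The helper name and the default seed are assumptions; the point is only that a fixed seed makes the stored sample reproducible.

```python
import random


def downsample_residuals(residuals, max_size=1000, seed=123):
    """Return at most max_size residuals, sampled without replacement.

    Illustrative only: the function name and the default seed are assumptions.
    """
    if len(residuals) <= max_size:
        return list(residuals)
    rng = random.Random(seed)
    return rng.sample(list(residuals), max_size)


residuals = [i / 100 for i in range(5000)]
first = downsample_residuals(residuals)
second = downsample_residuals(residuals)
print(len(first), first == second)
```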
[0.4.2] - [2022-01-08]
Added
- Increased verbosity of function backtesting_forecaster().
- Random state argument in backtesting_forecaster().
Changed
- Function backtesting_forecaster() does not modify the original forecaster.
- Deprecated argument set_out_sample_residuals in function backtesting_forecaster().
- Function model_selection.time_series_spliter renamed to model_selection.time_series_splitter.
Fixed
- Methods get_coef and get_feature_importance of ForecasterAutoregMultiOutput class return proper feature names.
[0.4.1] - [2021-12-13]
Added
Changed
Fixed
- fit and predict transform pandas series and dataframes to numpy arrays if regressor is XGBoost.
[0.4.0] - [2021-12-10]
Version 0.4 has undergone a huge code refactoring. Main changes are related to input-output formats (only pandas series and dataframes are allowed although internally numpy arrays are used for performance) and model validation methods (unified into backtesting with and without refit).
Added
- ForecasterBase as parent class.
Changed
- Argument y must be pandas Series. Numpy ndarrays are not allowed anymore.
- Argument exog must be pandas Series or pandas DataFrame. Numpy ndarrays are not allowed anymore.
- Output of predict is a pandas Series with index according to the steps predicted.
- Scikit-learn pipelines are allowed as regressors.
- backtesting_forecaster and backtesting_forecaster_intervals have been combined in a single function.
  - It is possible to backtest forecasters already trained.
  - ForecasterAutoregMultiOutput allows incomplete folds.
  - It is possible to update out_sample_residuals with backtesting residuals.
- cv_forecaster has the option to update out_sample_residuals with backtesting residuals.
- backtesting_sarimax_statsmodels and cv_sarimax_statsmodels have been combined in a single function.
- gridsearch_forecaster uses backtesting as validation strategy with the option of refit.
- Extended information when printing Forecaster object.
- All static methods for checking and preprocessing inputs moved to module utils.
- Remove deprecated class ForecasterCustom.
Fixed
[0.3.0] - [2021-09-01]
Added
- New module model_selection_statsmodels to cross-validate, backtest and grid search AutoReg and SARIMAX models from the statsmodels library:
  - backtesting_autoreg_statsmodels
  - cv_autoreg_statsmodels
  - backtesting_sarimax_statsmodels
  - cv_sarimax_statsmodels
  - grid_search_sarimax_statsmodels
- Added attribute window_size to ForecasterAutoreg and ForecasterAutoregCustom. It is equal to max_lag.
Changed
- cv_forecaster returns cross-validation metrics and cross-validation predictions.
- Added an extra column for each parameter in the dataframe returned by grid_search_forecaster.
- statsmodels 0.12.2 added to requirements.
Fixed
[0.2.0] - [2021-08-26]
Added
- Multiple exogenous variables can be passed as pandas DataFrame.
- Documentation at https://joaquinamatrodrigo.github.io/skforecast/
- New unit test
- Increased typing
Changed
- New implementation of ForecasterAutoregMultiOutput. The training process in the new version creates a different X_train for each step. See Direct multi-step forecasting for more details. The old version can be accessed with skforecast.deprecated.ForecasterAutoregMultiOutput.
Fixed
[0.1.9] - [2021-07-27]
Added
- Logging total number of models to fit in grid_search_forecaster.
- Class ForecasterAutoregCustom.
- Method create_train_X_y to facilitate access to the training data matrix created from y and exog.
Changed
- New implementation of ForecasterAutoregMultiOutput. The training process in the new version creates a different X_train for each step. See Direct multi-step forecasting for more details. The old version can be accessed with skforecast.deprecated.ForecasterAutoregMultiOutput.
- Class ForecasterCustom has been renamed to ForecasterAutoregCustom. However, ForecasterCustom will still remain to keep backward compatibility.
- Argument metric in cv_forecaster, backtesting_forecaster, grid_search_forecaster and backtesting_forecaster_intervals changed from 'neg_mean_squared_error', 'neg_mean_absolute_error', 'neg_mean_absolute_percentage_error' to 'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error'.
- Check if argument metric in cv_forecaster, backtesting_forecaster, grid_search_forecaster and backtesting_forecaster_intervals is one of 'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error'.
- time_series_spliter doesn't include the remaining observations in the last complete fold but in a new one when allow_incomplete_fold=True. Take into consideration that incomplete folds with few observations could overestimate or underestimate the validation metric.
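The incomplete-fold behavior described in the last item can be sketched as a small generator. This is a hypothetical re-creation that assumes an expanding training set and test folds of steps observations; split_test_folds is not the library function, and only the allow_incomplete_fold name comes from the text.

```python
def split_test_folds(n_obs, initial_train_size, steps, allow_incomplete_fold=True):
    """Yield (train_positions, test_positions) with an expanding training set.

    Hypothetical re-creation of the described behavior, not the library code.
    """
    start = initial_train_size
    while start < n_obs:
        end = start + steps
        if end > n_obs:
            if not allow_incomplete_fold:
                break
            end = n_obs  # leftover observations form their own, shorter fold
        yield range(0, start), range(start, end)
        start = end


folds = list(split_test_folds(n_obs=12, initial_train_size=5, steps=3))
test_sizes = [len(test) for _, test in folds]
print(test_sizes)
```

With 12 observations, 5 for the initial training and 3 steps per fold, the single leftover observation ends up in a third fold of size 1 instead of being appended to the second fold.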
Fixed
- Update lags of ForecasterAutoregMultiOutput after grid_search_forecaster.
[0.1.8.1] - [2021-05-17]
Added
- set_out_sample_residuals method to store or update out of sample residuals used by predict_interval.
Changed
- backtesting_forecaster_intervals and backtesting_forecaster print number of steps per fold.
- Only up to 1000 residuals are stored.
- Improved verbose in backtesting_forecaster_intervals.
Fixed
- Warning of incomplete folds when using backtesting_forecast with a ForecasterAutoregMultiOutput.
- ForecasterAutoregMultiOutput.predict allows exog data longer than needed (steps).
- backtesting_forecast correctly prints the number of folds when the remaining observations are zero.
- Removed named argument X in self.regressor.predict(X) to allow using XGBoost regressor.
- Values stored in self.last_window when training ForecasterAutoregMultiOutput.
[0.1.8] - [2021-04-02]
Added
- Class ForecasterAutoregMultiOutput.py: forecaster with direct multi-step predictions.
- Method ForecasterCustom.predict_interval and ForecasterAutoreg.predict_interval: estimate prediction interval using bootstrapping.
- skforecast.model_selection.backtesting_forecaster_intervals: perform backtesting and return prediction intervals.
Changed
Fixed
[0.1.7] - [2021-03-19]
Added
- Class ForecasterCustom: same functionalities as ForecasterAutoreg but allows custom definition of predictors.
Changed
- grid_search forecaster adapted to work with ForecasterCustom objects in addition to ForecasterAutoreg.
Fixed
[0.1.6] - [2021-03-14]
Added
- Method get_feature_importances to skforecast.ForecasterAutoreg.
- Added backtesting strategy in grid_search_forecaster.
- Added backtesting_forecast to skforecast.model_selection.
Changed
- Method create_lags returns a matrix where the order of columns matches the ascending order of lags. For example, column 0 contains the values of the minimum lag used as predictor.
- Renamed argument X to last_window in method predict.
- Renamed ts_cv_forecaster to cv_forecaster.
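The column ordering described for create_lags can be illustrated with a small pure-Python version. The name mirrors the change log entry, but the body is an assumption written for this example and is not the library implementation.

```python
def create_lags(y, lags):
    """Build (X, target) where column j of X holds lag sorted(lags)[j].

    Illustrative sketch; the name mirrors the change log entry but the body
    is not the library implementation.
    """
    lags = sorted(lags)
    max_lag = lags[-1]
    # Row for time t contains y[t - lag] for each lag, in ascending lag order.
    X = [[y[t - lag] for lag in lags] for t in range(max_lag, len(y))]
    target = list(y[max_lag:])
    return X, target


X, target = create_lags(y=[10, 11, 12, 13, 14, 15], lags=[3, 1])
print(X)
print(target)
```

Even though the lags are passed as [3, 1], column 0 holds lag 1 (the minimum lag) and column 1 holds lag 3.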
Fixed
[0.1.4] - [2021-02-15]
Added
- Method get_coef to skforecast.ForecasterAutoreg.
Changed
Fixed