Backtesting¶
In time series forecasting, backtesting refers to the process of validating a predictive model using historical data. The technique involves moving backwards in time, step-by-step, to assess how well a model would have performed if it had been used to make predictions during that time period. Backtesting is a form of cross-validation that is applied to previous periods in the time series.
The purpose of backtesting is to evaluate the accuracy and effectiveness of a model and identify any potential issues or areas of improvement. By testing the model on historical data, one can assess how well it performs on data that it has not seen before. This is an important step in the modeling process, as it helps to ensure that the model is robust and reliable.
Backtesting can be done using a variety of techniques, such as simple train-test splits or more sophisticated methods like rolling windows or expanding windows. The choice of method depends on the specific needs of the analysis and the characteristics of the time series data.
Overall, backtesting is an essential step in the development of a time series forecasting model. By rigorously testing the model on historical data, one can improve its accuracy and ensure that it is effective at predicting future values of the time series.
Backtesting with refit and increasing training size (fixed origin)¶
In this strategy, the model is trained before making predictions each time, and all available data up to that point is used in the training process. This differs from standard cross-validation, where the data is randomly distributed between training and validation sets.
Instead of randomizing the data, this backtesting sequentially increases the size of the training set while maintaining the temporal order of the data. By doing this, the model can be tested on progressively larger amounts of historical data, providing a more accurate assessment of its predictive capabilities.
Backtesting with refit and fixed training size (rolling origin)¶
In this approach, the model is trained using a fixed window of past observations, and the testing is performed on a rolling basis, where the training window is moved forward in time. The size of the training window is kept constant, allowing for the model to be tested on different sections of the data. This technique is particularly useful when there is a limited amount of data available, or when the data is non-stationary, and the model's performance may vary over time. Is also known as time series cross-validation or walk-forward validation.
Backtesting without refit¶
Backtesting without refit is a strategy where the model is trained only once and used sequentially without updating it, following the temporal order of the data. This approach is advantageous as it is much faster than other methods that require retraining the model each time. However, the model may lose its predictive power over time as it does not incorporate the latest information available.
Backtesting including gap¶
This approach introduces a time gap between the training and test sets, replicating a scenario where predictions cannot be made immediately after the end of the training data.
For example, consider the goal of predicting the 24 hours of day D+1, but the predictions need to be made at 11:00 to allow sufficient flexibility. At 11:00 on day D, the task is to forecast hours [12 - 23] of the same day and hours [0 - 23] of day D+1. Thus, a total of 36 hours into the future must be predicted, with only the last 24 hours to be stored.
Libraries¶
# Libraries
# ==============================================================================
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from skforecast.ForecasterAutoreg import ForecasterAutoreg
from skforecast.model_selection import backtesting_forecaster
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
Data¶
# Download data
# ==============================================================================
url = (
'https://raw.githubusercontent.com/JoaquinAmatRodrigo/skforecast/master/'
'data/h2o.csv'
)
data = pd.read_csv(url, sep=',', header=0, names=['y', 'datetime'])
# Data preprocessing
# ==============================================================================
data['datetime'] = pd.to_datetime(data['datetime'], format='%Y-%m-%d')
data = data.set_index('datetime')
data = data.asfreq('MS')
data = data[['y']]
data = data.sort_index()
# Train-validation dates
# ==============================================================================
end_train = '2002-01-01 23:59:00'
print(
f"Train dates : {data.index.min()} --- {data.loc[:end_train].index.max()}"
f" (n={len(data.loc[:end_train])})"
)
print(
f"Validation dates : {data.loc[end_train:].index.min()} --- {data.index.max()}"
f" (n={len(data.loc[end_train:])})"
)
# Plot
# ==============================================================================
fig, ax = plt.subplots(figsize=(6, 3))
data.loc[:end_train].plot(ax=ax, label='train')
data.loc[end_train:].plot(ax=ax, label='validation')
ax.legend()
plt.show()
display(data.head(4))
Train dates : 1991-07-01 00:00:00 --- 2002-01-01 00:00:00 (n=127) Validation dates : 2002-02-01 00:00:00 --- 2008-06-01 00:00:00 (n=77)
y | |
---|---|
datetime | |
1991-07-01 | 0.429795 |
1991-08-01 | 0.400906 |
1991-09-01 | 0.432159 |
1991-10-01 | 0.492543 |
Backtest¶
An example of a backtesting with refit consists of the following steps:
Train the model using an initial training set, the length of which is specified by
initial_train_size
.Once the model is trained, it is used to make predictions for the next 10 steps (
steps=10
) in the data. These predictions are saved for later evaluation.As
refit
is set toTrue
, the size of the training set is increased by adding the lats 10 data points (the previously predicted 10 steps), while the next 10 steps are used as test data.After expanding the training set, the model is retrained using the updated training data and then used to predict the next 10 steps.
Repeat steps 3 and 4 until the entire series has been tested.
By following these steps, you can ensure that the model is evaluated on multiple sets of test data, thereby providing a more accurate assessment of its predictive power.
# Backtesting forecaster
# ==============================================================================
forecaster = ForecasterAutoreg(
regressor = RandomForestRegressor(random_state=123),
lags = 15
)
metric, predictions = backtesting_forecaster(
forecaster = forecaster,
y = data['y'],
steps = 10,
metric = 'mean_squared_error',
initial_train_size = len(data.loc[:end_train]),
fixed_train_size = False,
gap = 0,
allow_incomplete_fold = True,
refit = True,
verbose = True,
show_progress = True
)
Information of backtesting process ---------------------------------- Number of observations used for initial training: 127 Number of observations used for backtesting: 77 Number of folds: 8 Number of steps per fold: 10 Number of steps to exclude from the end of each train set before test (gap): 0 Last fold only includes 7 observations. Fold: 0 Training: 1991-07-01 00:00:00 -- 2002-01-01 00:00:00 (n=127) Validation: 2002-02-01 00:00:00 -- 2002-11-01 00:00:00 (n=10) Fold: 1 Training: 1991-07-01 00:00:00 -- 2002-11-01 00:00:00 (n=137) Validation: 2002-12-01 00:00:00 -- 2003-09-01 00:00:00 (n=10) Fold: 2 Training: 1991-07-01 00:00:00 -- 2003-09-01 00:00:00 (n=147) Validation: 2003-10-01 00:00:00 -- 2004-07-01 00:00:00 (n=10) Fold: 3 Training: 1991-07-01 00:00:00 -- 2004-07-01 00:00:00 (n=157) Validation: 2004-08-01 00:00:00 -- 2005-05-01 00:00:00 (n=10) Fold: 4 Training: 1991-07-01 00:00:00 -- 2005-05-01 00:00:00 (n=167) Validation: 2005-06-01 00:00:00 -- 2006-03-01 00:00:00 (n=10) Fold: 5 Training: 1991-07-01 00:00:00 -- 2006-03-01 00:00:00 (n=177) Validation: 2006-04-01 00:00:00 -- 2007-01-01 00:00:00 (n=10) Fold: 6 Training: 1991-07-01 00:00:00 -- 2007-01-01 00:00:00 (n=187) Validation: 2007-02-01 00:00:00 -- 2007-11-01 00:00:00 (n=10) Fold: 7 Training: 1991-07-01 00:00:00 -- 2007-11-01 00:00:00 (n=197) Validation: 2007-12-01 00:00:00 -- 2008-06-01 00:00:00 (n=7)
0%| | 0/8 [00:00<?, ?it/s]
print(f"Backtest error: {metric}")
predictions.head(4)
Backtest error: 0.00818535931502708
pred | |
---|---|
2002-02-01 | 0.594506 |
2002-03-01 | 0.785886 |
2002-04-01 | 0.698925 |
2002-05-01 | 0.790560 |
# Plot predictions
# ==============================================================================
fig, ax = plt.subplots(figsize=(6, 3))
data.loc[end_train:, 'y'].plot(ax=ax)
predictions.plot(ax=ax)
ax.legend();
Backtesting with gap¶
The gap size can be adjusted with the gap
argument. In addition, the allow_incomplete_fold
parameter allows the last fold to be excluded from the analysis if it doesn't have the same size as the required number of steps
.
These arguments can be used in conjunction with either refit set to True
or False
, depending on the needs and objectives of the use case.