Probabilistic Forecasting: Global Models¶
Skforecast allows to apply all its implemented probabilistic forecasting methods (bootstrapping, conformal prediction and quantile regression) to global models. This means that the model is trained with all the available time series and the forecast is made for all the time series.
For detailed information about the available probabilistic forecasting methods, see the following user guides:
💡 Tip
When predicting multiple series, scaling to hundreds or thousands of series, computational time can be a bottleneck. For these cases, using the conformal framework can be a good option, as it is faster than the bootstrapping and quantile regression methods.
In case of using bootstrapping, note that if use_binned_residuals
is set to True
, the computational time will be even higher.
Libraries and data¶
# Data processing
# ==============================================================================
import numpy as np
import pandas as pd
from skforecast.datasets import fetch_dataset
# Plots
# ==============================================================================
import matplotlib.pyplot as plt
from skforecast.plot import set_dark_theme, plot_prediction_intervals
from statsmodels.graphics.tsaplots import plot_pacf
from itertools import cycle
# Modelling and Forecasting
# ==============================================================================
from lightgbm import LGBMRegressor
from skforecast.recursive import ForecasterRecursiveMultiSeries
from skforecast.model_selection import TimeSeriesFold, backtesting_forecaster_multiseries
from skforecast.metrics import calculate_coverage
from skforecast.plot import calculate_lag_autocorrelation
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
# Configuration
# ==============================================================================
import warnings
warnings.filterwarnings('once')
# Data
# ==============================================================================
data = fetch_dataset(name="bdg2_hourly_sample")
data = data.loc['2016-01-01 00:00:00' : '2016-09-30 00:00:00', :]
data.head(2)
bdg2_hourly_sample ------------------ Daily energy consumption data of two buildings sampled from the The Building Data Genome Project 2. https://github.com/buds-lab/building-data-genome- project-2 Miller, C., Kathirgamanathan, A., Picchetti, B. et al. The Building Data Genome Project 2, energy meter data from the ASHRAE Great Energy Predictor III competition. Sci Data 7, 368 (2020). https://doi.org/10.1038/s41597-020-00712-x Shape of the dataset: (6553, 2)
building_1 | building_2 | |
---|---|---|
timestamp | ||
2016-01-01 00:00:00 | 186.532 | 219.27 |
2016-01-01 01:00:00 | 186.532 | 219.27 |
# Calendar features
# ==============================================================================
data['day_of_week'] = data.index.dayofweek
data['hour'] = data.index.hour
transformer = ColumnTransformer(
transformers=[
('one_hot_encoder', OneHotEncoder(sparse_output=False), ['day_of_week', 'hour'])
],
remainder='passthrough',
verbose_feature_names_out=False
).set_output(transform='pandas')
data = transformer.fit_transform(data)
data.head()
day_of_week_0 | day_of_week_1 | day_of_week_2 | day_of_week_3 | day_of_week_4 | day_of_week_5 | day_of_week_6 | hour_0 | hour_1 | hour_2 | ... | hour_16 | hour_17 | hour_18 | hour_19 | hour_20 | hour_21 | hour_22 | hour_23 | building_1 | building_2 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
timestamp | |||||||||||||||||||||
2016-01-01 00:00:00 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 186.532 | 219.270 |
2016-01-01 01:00:00 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 186.532 | 219.270 |
2016-01-01 02:00:00 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 183.479 | 218.348 |
2016-01-01 03:00:00 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 184.303 | 220.003 |
2016-01-01 04:00:00 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 183.657 | 217.731 |
5 rows × 33 columns
# Plot time series
# ==============================================================================
set_dark_theme()
plt.rcParams['lines.linewidth'] = 0.5
colors = cycle(plt.rcParams['axes.prop_cycle'].by_key()['color'])
fig, axs = plt.subplots(2, 1, figsize=(7, 4), sharex=True)
series = ['building_1', 'building_2']
for i, col in enumerate(series):
axs[i].plot(data[col], label=col, color=next(colors))
axs[i].legend(loc='lower left', fontsize=8)
axs[i].tick_params(axis='both', labelsize=8)
plt.tight_layout()
# Partial autocorrelation values and plots
# ==============================================================================
n_lags = 7 * 24
pacf_df = []
fig, axs = plt.subplots(2, 1, figsize=(6, 3))
axs = axs.ravel()
for i, col in enumerate(series):
pacf_values = calculate_lag_autocorrelation(data[col], n_lags=n_lags)
pacf_values['variable'] = col
pacf_df.append(pacf_values)
plot_pacf(data[col], lags=n_lags, ax=axs[i])
axs[i].set_title(col, fontsize=10)
axs[i].set_ylim(-0.5, 1.1)
plt.tight_layout()
# Top n lags with highest absolute partial autocorrelation per variable
# ==============================================================================
n = 10
top_lags = set()
for pacf_values in pacf_df:
variable = pacf_values['variable'].iloc[0]
lags = pacf_values.nlargest(n, 'partial_autocorrelation_abs')['lag'].sort_values().tolist()
top_lags.update(lags)
print(f"{variable}: {lags}")
top_lags = list(top_lags)
top_lags.sort()
print(f"\nAll lags: {top_lags}")
building_1: [1, 2, 3, 15, 16, 19, 24, 25, 136, 145] building_2: [1, 2, 3, 20, 21, 22, 25, 26, 145, 166] All lags: [1, 2, 3, 15, 16, 19, 20, 21, 22, 24, 25, 26, 136, 145, 166]
The target series exhibit similar dynamics with several lagged correlations that can be used as predictors.
# Split train-validation-test
# ==============================================================================
end_train = '2016-07-01 23:59:00'
end_validation = '2016-09-01 23:59:00'
data_train = data.loc[: end_train, :]
data_val = data.loc[end_train:end_validation, :]
data_test = data.loc[end_validation:, :]
print(f"Dates train : {data_train.index.min()} --- {data_train.index.max()} (n={len(data_train)})")
print(f"Dates validation : {data_val.index.min()} --- {data_val.index.max()} (n={len(data_val)})")
print(f"Dates test : {data_test.index.min()} --- {data_test.index.max()} (n={len(data_test)})")
Dates train : 2016-01-01 00:00:00 --- 2016-07-01 23:00:00 (n=4392) Dates validation : 2016-07-02 00:00:00 --- 2016-09-01 23:00:00 (n=1488) Dates test : 2016-09-02 00:00:00 --- 2016-09-30 00:00:00 (n=673)
# Plot partitions
# ==============================================================================
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
fig, axs = plt.subplots(2, 1, figsize=(7, 5), sharex=True)
for i, col in enumerate(series):
axs[i].plot(data[col], label=col, color=colors[i])
axs[i].legend(loc='lower left', fontsize=8)
axs[i].tick_params(axis='both', labelsize=8)
axs[i].axvline(pd.to_datetime(end_train), color='white', linestyle='--', linewidth=1) # End train
axs[i].axvline(pd.to_datetime(end_validation), color='white', linestyle='--', linewidth=1) # End validation
plt.tight_layout()
# Create forecaster
# ==============================================================================
exog_features = data.columns[data.columns.str.contains('day_of_|hour_')].tolist()
params = {
"n_estimators": 300,
"learning_rate": 0.05,
"max_depth": 5,
}
forecaster = ForecasterRecursiveMultiSeries(
regressor = LGBMRegressor(random_state=15926, verbose=-1, **params),
lags = top_lags,
encoding = 'ordinal',
)
# Backtesting on validation data to obtain out-sample residuals
# ==============================================================================
cv = TimeSeriesFold(
initial_train_size = len(data.loc[:end_train, :]),
steps = 24,
)
metric_val, predictions_val = backtesting_forecaster_multiseries(
forecaster = forecaster,
series = data.loc[:end_validation, series],
exog = data.loc[:end_validation, exog_features],
cv = cv,
metric = 'mean_absolute_error'
)
0%| | 0/62 [00:00<?, ?it/s]
# Plot predictions on validation data
# ==============================================================================
fig, axs = plt.subplots(2, 1, figsize=(7, 5.5), sharex=True, sharey=False)
for i, level in enumerate(predictions_val['level'].unique()):
predictions_val_level = predictions_val.loc[predictions_val['level'] == level, 'pred']
data.loc[end_train:end_validation, level].plot(ax=axs[i], label='Real value')
predictions_val_level.plot(ax=axs[i], label='prediction')
axs[i].set_xlabel("")
axs[i].set_title(level)
axs[i].legend(loc='lower right', fontsize=7)
plt.tight_layout()
# Out-sample residuals distribution
# ==============================================================================
fig, axs = plt.subplots(2, 1, figsize=(7, 4), sharex=True, sharey=False)
for i, level in enumerate(predictions_val["level"].unique()):
residuals = (
data.loc[predictions_val.index, level]
- predictions_val.loc[predictions_val["level"] == level, "pred"]
)
residuals.plot(ax=axs[i], label=f"Residuals {level}")
axs[i].legend(loc="upper right", fontsize=7)
fig.tight_layout()
# Store out-sample residuals in the forecaster
# ==============================================================================
forecaster.fit(
series = data.loc[:end_train, series],
exog = data.loc[:end_train, exog_features]
)
forecaster.set_out_sample_residuals(
y_true = {k: data.loc[predictions_val.index.unique(), k] for k in series},
y_pred = {k: v for k, v in predictions_val.groupby('level')['pred']}
)
✎ Note
When new residuals are stored in the forecaster, the residuals are binned according to the size of the predictions to which they correspond. Later, in the bootstrapping process, the residuals are sampled from the appropriate bin to ensure that the distribution of the residuals is consistent with the predictions. This process allows for more accurate prediction intervals because the residuals are more closely aligned with the predictions to which they correspond, resulting in better coverage with narrower intervals. For more information about how residuals are used in interval estimation visit Probabilistic forecasting: Bootstrapped Residuals.
# Intervals of the residual bins (conditioned on predicted values) for each level
# ==============================================================================
from pprint import pprint
for k, v in forecaster.binner_intervals_.items():
print(k)
pprint(v)
print("")
building_1 {0: (162.55654558304457, 188.36664987311164), 1: (188.36664987311164, 192.69194794958057), 2: (192.69194794958057, 196.33152734653578), 3: (196.33152734653578, 200.63335799466273), 4: (200.63335799466273, 205.69014402995123), 5: (205.69014402995123, 212.73488176715347), 6: (212.73488176715347, 223.7347135798683), 7: (223.7347135798683, 247.49425709088368), 8: (247.49425709088368, 257.45057166569234), 9: (257.45057166569234, 292.2588224363378)} building_2 {0: (191.2489257150324, 221.78927964090707), 1: (221.78927964090707, 227.7092976710155), 2: (227.7092976710155, 231.21030346461032), 3: (231.21030346461032, 234.22923980033235), 4: (234.22923980033235, 237.49248298694857), 5: (237.49248298694857, 241.4080869629871), 6: (241.4080869629871, 260.62031922270853), 7: (260.62031922270853, 288.5886870651536), 8: (288.5886870651536, 299.908273995279), 9: (299.908273995279, 317.16060597117587)} _unknown_level {0: (162.55654558304457, 192.66325275631186), 1: (192.66325275631186, 200.60633333417312), 2: (200.60633333417312, 211.4832393790082), 3: (211.4832393790082, 222.46785614535224), 4: (222.46785614535224, 230.27647724100552), 5: (230.27647724100552, 235.8292109781876), 6: (235.8292109781876, 242.7889817619577), 7: (242.7889817619577, 258.04911594281583), 8: (258.04911594281583, 288.66360939645296), 9: (288.66360939645296, 317.16060597117587)}
# Distribution of the residual by bin and level
# ==============================================================================
fig, axs = plt.subplots(2, 2, figsize=(7, 5), sharex=True, sharey=True)
axs = axs.ravel()
for i, level in enumerate(forecaster.out_sample_residuals_by_bin_):
out_sample_residuals_by_bin_df = pd.DataFrame(
dict(
[(k, pd.Series(v))
for k, v in forecaster.out_sample_residuals_by_bin_[level].items()]
)
)
flierprops = dict(marker='o', markerfacecolor='gray', markersize=5, linestyle='none')
out_sample_residuals_by_bin_df.boxplot(ax=axs[i], flierprops=flierprops)
axs[i].set_title(level, fontsize=10)
fig.suptitle("Distribution of residuals by bin", fontsize=12)
fig.tight_layout()
✎ Note
Two arguments control the use of residuals in predict()
and backtesting_forecaster()
:
use_in_sample_residuals
: IfTrue
, the in-sample residuals are used to compute the prediction intervals. Since these residuals are obtained from the training set, they are always available, but usually lead to overoptimistic intervals. IfFalse
, the out-sample residuals (calibration) are used to calculate the prediction intervals. These residuals are obtained from the validation/calibration set and are only available if theset_out_sample_residuals()
method has been called. It is recommended to use out-sample residuals to achieve the desired coverage.use_binned_residuals
: IfFalse
, a single correction factor is applied to all predictions during conformalization. IfTrue
, the conformalization process uses a correction factor that depends on the bin where the prediction falls. This can be thought as a type of Adaptive Conformal Predictions and it can lead to more accurate prediction intervals since the correction factor is conditioned on the region of the prediction space.
Intervals using conformal prediction¶
# Backtesting with conformal prediction intervals
# ==============================================================================
cv = TimeSeriesFold(
initial_train_size = len(data.loc[:end_validation, :]),
steps = 24,
)
metric, predictions_test = backtesting_forecaster_multiseries(
forecaster = forecaster,
series = data[series],
exog = data[exog_features],
cv = cv,
metric = 'mean_absolute_error',
interval = [10, 90],
interval_method = "conformal",
use_in_sample_residuals = False, # Use out-sample residuals
use_binned_residuals = True # Adaptative intervals
)
predictions_test
0%| | 0/29 [00:00<?, ?it/s]
level | pred | lower_bound | upper_bound | |
---|---|---|---|---|
2016-09-02 00:00:00 | building_1 | 198.654360 | 189.636937 | 207.671784 |
2016-09-02 00:00:00 | building_2 | 204.937493 | 195.752042 | 214.122945 |
2016-09-02 01:00:00 | building_1 | 195.443918 | 186.426494 | 204.461342 |
2016-09-02 01:00:00 | building_2 | 202.337314 | 193.151862 | 211.522766 |
2016-09-02 02:00:00 | building_1 | 192.120380 | 184.568154 | 199.672607 |
... | ... | ... | ... | ... |
2016-09-29 22:00:00 | building_2 | 214.340279 | 205.154827 | 223.525730 |
2016-09-29 23:00:00 | building_1 | 203.202476 | 193.458915 | 212.946037 |
2016-09-29 23:00:00 | building_2 | 211.105410 | 201.919958 | 220.290861 |
2016-09-30 00:00:00 | building_1 | 200.085366 | 190.341805 | 209.828928 |
2016-09-30 00:00:00 | building_2 | 204.815233 | 195.629781 | 214.000684 |
1346 rows × 4 columns
# Plot intervals and calculate coverage for each level
# ==============================================================================
for level in predictions_test['level'].unique():
print(f"\nLevel: {level}")
predictions_level = (
predictions_test[predictions_test["level"] == level]
.drop(columns="level")
.copy()
)
# Plot intervals
plt.rcParams['lines.linewidth'] = 1
fig, ax = plt.subplots(figsize=(7, 3))
plot_prediction_intervals(
predictions = predictions_level,
y_true = data_test[[level]],
target_variable = level,
initial_x_zoom = None,
title = "Prediction intervals",
xaxis_title = "",
yaxis_title = level,
ax = ax
)
plt.gca().legend(loc='upper left')
fill_between_obj = ax.collections[0]
fill_between_obj.set_facecolor('white')
fill_between_obj.set_alpha(0.3)
# Predicted interval coverage (on test data)
coverage = calculate_coverage(
y_true = data_test[level],
lower_bound = predictions_level["lower_bound"],
upper_bound = predictions_level["upper_bound"]
)
print(f"Predicted interval coverage: {round(100 * coverage, 2)} %")
# Area of the interval
area = (predictions_level["upper_bound"] - predictions_level["lower_bound"]).sum()
print(f"Area of the interval: {round(area, 2)}")
Level: building_1 Predicted interval coverage: 83.8 % Area of the interval: 12929.54 Level: building_2 Predicted interval coverage: 78.75 % Area of the interval: 13614.4
Intervals using bootstrapped residuals¶
The same process is repeated but this time using the conformal method instead of bootstrapping.
# Backtesting with prediction intervals using bootstrapping
# ==============================================================================
cv = TimeSeriesFold(
initial_train_size = len(data.loc[:end_validation, :]),
steps = 24,
)
metric, predictions_test = backtesting_forecaster_multiseries(
forecaster = forecaster,
series = data[series],
exog = data[exog_features],
cv = cv,
metric = 'mean_absolute_error',
interval = [10, 90],
interval_method = "bootstrapping",
n_boot = 150,
use_in_sample_residuals = False, # Use out-sample residuals
use_binned_residuals = True # Residuals conditioned on predicted values
)
predictions_test
0%| | 0/29 [00:00<?, ?it/s]
level | pred | lower_bound | upper_bound | |
---|---|---|---|---|
2016-09-02 00:00:00 | building_1 | 198.654360 | 193.703464 | 209.075452 |
2016-09-02 00:00:00 | building_2 | 204.937493 | 194.825202 | 212.585027 |
2016-09-02 01:00:00 | building_1 | 195.443918 | 186.221740 | 209.323422 |
2016-09-02 01:00:00 | building_2 | 202.337314 | 188.617664 | 213.909010 |
2016-09-02 02:00:00 | building_1 | 192.120380 | 179.659126 | 206.835901 |
... | ... | ... | ... | ... |
2016-09-29 22:00:00 | building_2 | 214.340279 | 198.380518 | 237.528369 |
2016-09-29 23:00:00 | building_1 | 203.202476 | 192.659555 | 258.594803 |
2016-09-29 23:00:00 | building_2 | 211.105410 | 193.229093 | 238.896552 |
2016-09-30 00:00:00 | building_1 | 200.085366 | 188.294833 | 208.383293 |
2016-09-30 00:00:00 | building_2 | 204.815233 | 193.892343 | 214.680595 |
1346 rows × 4 columns
# Plot intervals and calculate coverage for each level
# ==============================================================================
for level in predictions_test["level"].unique():
print(f"\nLevel: {level}")
predictions_level = (
predictions_test[predictions_test["level"] == level]
.drop(columns="level")
.copy()
)
# Plot intervals
plt.rcParams["lines.linewidth"] = 1
fig, ax = plt.subplots(figsize=(7, 3))
plot_prediction_intervals(
predictions = predictions_level,
y_true = data_test[[level]],
target_variable = level,
initial_x_zoom = None,
title = "Prediction intervals",
xaxis_title = "",
yaxis_title = level,
ax = ax
)
plt.gca().legend(loc="upper left", fontsize=7)
fill_between_obj = ax.collections[0]
fill_between_obj.set_facecolor("white")
fill_between_obj.set_alpha(0.3)
# Predicted interval coverage (on test data)
coverage = calculate_coverage(
y_true = data_test[level],
lower_bound = predictions_level["lower_bound"],
upper_bound = predictions_level["upper_bound"]
)
print(f"Predicted interval coverage: {round(100 * coverage, 2)} %")
# Area of the interval
area = (predictions_level["upper_bound"] - predictions_level["lower_bound"]).sum()
print(f"Area of the interval: {round(area, 2)}")
Level: building_1 Predicted interval coverage: 93.02 % Area of the interval: 32018.44 Level: building_2 Predicted interval coverage: 95.69 % Area of the interval: 29612.23
The resulting intervals are overly conservative, they have an empirical coverage higher than expected (80%).
Forecasting intervals for unknown series¶
The ForecasterRecursiveMultiseries
class allows the forecasting of series not seen during training, with the only requirement that at least window_size
observations of the new series are available. To generate probabilistic predictions, the forecaster uses a random sample of the residuals of all known series.
# Predictions for an known series
# ==============================================================================
# Simulate last_window of the new series
len_last_window = forecaster.max_lag
last_window = pd.DataFrame({
'new_series': data_val['building_1'].iloc[-len_last_window:] + np.random.normal(0, 0.1, len_last_window)
})
forecaster.predict_interval(
steps = 24,
last_window = last_window,
exog = data_test[exog_features],
method = 'conformal',
interval = [10, 90],
use_in_sample_residuals = False
)
╭──────────────────────────────── UnknownLevelWarning ─────────────────────────────────╮ │ `levels` {'new_series'} were not included in training. Unknown levels are encoded as │ │ NaN, which may cause the prediction to fail if the regressor does not accept NaN │ │ values. │ │ │ │ Category : UnknownLevelWarning │ │ Location : │ │ /home/ubuntu/anaconda3/envs/skforecast_15_py12/lib/python3.12/site-packages/skforeca │ │ st/utils/utils.py:899 │ │ Suppress : warnings.simplefilter('ignore', category=UnknownLevelWarning) │ ╰──────────────────────────────────────────────────────────────────────────────────────╯
╭──────────────────────────────── UnknownLevelWarning ─────────────────────────────────╮ │ `levels` {'new_series'} are not present in │ │ `forecaster.out_sample_residuals_by_bin_`. A random sample of the residuals from │ │ other levels will be used. This can lead to inaccurate intervals for the unknown │ │ levels. Otherwise, Use the `set_out_sample_residuals()` method before predicting to │ │ set the residuals for these levels. │ │ │ │ Category : UnknownLevelWarning │ │ Location : │ │ /home/ubuntu/anaconda3/envs/skforecast_15_py12/lib/python3.12/site-packages/skforeca │ │ st/utils/utils.py:1320 │ │ Suppress : warnings.simplefilter('ignore', category=UnknownLevelWarning) │ ╰──────────────────────────────────────────────────────────────────────────────────────╯
level | pred | lower_bound | upper_bound | |
---|---|---|---|---|
2016-09-02 00:00:00 | new_series | 199.295213 | 190.663831 | 207.926595 |
2016-09-02 01:00:00 | new_series | 196.635219 | 188.003837 | 205.266601 |
2016-09-02 02:00:00 | new_series | 193.146228 | 184.514846 | 201.777610 |
2016-09-02 03:00:00 | new_series | 190.924593 | 182.826129 | 199.023058 |
2016-09-02 04:00:00 | new_series | 190.758139 | 182.659675 | 198.856604 |
2016-09-02 05:00:00 | new_series | 190.190634 | 182.092170 | 198.289099 |
2016-09-02 06:00:00 | new_series | 191.960193 | 183.861728 | 200.058657 |
2016-09-02 07:00:00 | new_series | 200.684158 | 191.357070 | 210.011246 |
2016-09-02 08:00:00 | new_series | 220.888254 | 211.254333 | 230.522175 |
2016-09-02 09:00:00 | new_series | 239.945580 | 225.287244 | 254.603916 |
2016-09-02 10:00:00 | new_series | 248.945741 | 234.623388 | 263.268094 |
2016-09-02 11:00:00 | new_series | 253.711857 | 239.389503 | 268.034210 |
2016-09-02 12:00:00 | new_series | 254.023096 | 239.700742 | 268.345449 |
2016-09-02 13:00:00 | new_series | 254.873665 | 240.551312 | 269.196018 |
2016-09-02 14:00:00 | new_series | 255.210915 | 240.888562 | 269.533269 |
2016-09-02 15:00:00 | new_series | 252.259916 | 237.937562 | 266.582269 |
2016-09-02 16:00:00 | new_series | 248.133738 | 233.811385 | 262.456091 |
2016-09-02 17:00:00 | new_series | 237.257947 | 222.599611 | 251.916283 |
2016-09-02 18:00:00 | new_series | 226.574851 | 216.331651 | 236.818052 |
2016-09-02 19:00:00 | new_series | 215.773805 | 206.139884 | 225.407726 |
2016-09-02 20:00:00 | new_series | 211.473445 | 202.146356 | 220.800533 |
2016-09-02 21:00:00 | new_series | 210.196937 | 200.869848 | 219.524025 |
2016-09-02 22:00:00 | new_series | 207.999901 | 198.672813 | 217.326990 |
2016-09-02 23:00:00 | new_series | 204.757490 | 195.430402 | 214.084578 |
Probabilistic forecasting in production¶
⚠ Warning
The correct estimation of prediction intervals with conformal methods depends on the residuals being representative of future errors. For this reason, calibration residuals should be used. However, the dynamics of the series and models can change over time, so it is important to monitor and regularly update the residuals. It can be done easily using the set_out_sample_residuals()
method.