Probabilistic Forecasting: Quantile Regression¶
Quantile regression is a technique for estimating the conditional quantiles of a response variable. By combining the predictions of two quantile regressors, an interval can be constructed, where each model estimates one of the interval's boundaries. For example, models trained for the 10% and 90% quantiles produce an 80% prediction interval (90% - 10% = 80%).
If a machine learning algorithm capable of modeling quantiles is used as the regressor in a forecaster, the `predict` method will return predictions for the specified quantile. By creating two forecasters, each configured with a different quantile, their predictions can be combined to generate a prediction interval.
As opposed to linear regression, which is intended to estimate the conditional mean of the response variable given certain values of the predictor variables, quantile regression aims at estimating the conditional quantiles of the response variable. For a continuous distribution function, the $\alpha$-quantile $Q(\alpha)$ is defined such that the probability of the response being smaller than $Q(\alpha)$ is, for a given $\alpha \in (0, 1)$, equal to $\alpha$. For example, 36% of the population values are lower than the quantile $Q(0.36)$. The best-known quantile is the 50%-quantile, more commonly called the median.
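The definition above can be checked numerically: the empirical $\alpha$-quantile of a sample is the value below which a fraction $\alpha$ of the observations fall. A minimal sketch using NumPy (the simulated data is illustrative only):

```python
import numpy as np

# Simulated response values (illustrative distribution, not the ETT data)
rng = np.random.default_rng(seed=42)
y = rng.normal(loc=10, scale=2, size=100_000)

# The 0.36-quantile: the value below which ~36% of observations fall
q_36 = np.quantile(y, 0.36)
share_below = (y < q_36).mean()
print(f"0.36-quantile: {q_36:.2f}, share of values below it: {share_below:.3f}")  # ~0.36

# The 0.5-quantile is the median
print(np.isclose(np.quantile(y, 0.5), np.median(y)))  # True
```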
By combining the predictions of two quantile regressors, it is possible to build an interval, where each model estimates one of its limits. For example, the models obtained for $Q(0.10)$ and $Q(0.90)$ produce an 80% prediction interval (90% - 10% = 80%).
Several machine learning algorithms are capable of modeling quantiles, for example LightGBM (used in this document) and scikit-learn's GradientBoostingRegressor and HistGradientBoostingRegressor.
Just as the squared-error loss function is used to train models that predict the mean value, a specific loss function is needed in order to train models that predict quantiles. The most common metric used for quantile regression is called the quantile loss or pinball loss:

$$\text{pinball}(y, \hat{y}) = \begin{cases} \alpha \, (y - \hat{y}) & \text{if } y \ge \hat{y} \\ (1 - \alpha) \, (\hat{y} - y) & \text{if } y < \hat{y} \end{cases}$$

where $\alpha$ is the target quantile, $y$ the real value and $\hat{y}$ the quantile prediction.
It can be seen that the loss differs depending on the evaluated quantile: the higher the quantile, the more the loss function penalizes underestimates and the less it penalizes overestimates. As with MSE and MAE, the goal is to minimize its value (the lower the loss, the better).
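The asymmetry described above can be made concrete with a direct implementation of the formula (scikit-learn offers an equivalent `sklearn.metrics.mean_pinball_loss`); the numbers below are illustrative:

```python
import numpy as np

def pinball_loss(y_true, y_pred, alpha):
    """Mean pinball (quantile) loss for target quantile alpha."""
    error = np.asarray(y_true) - np.asarray(y_pred)
    # max(alpha * e, (alpha - 1) * e) is the piecewise formula in one expression
    return np.mean(np.maximum(alpha * error, (alpha - 1) * error))

y_true = np.array([10.0, 10.0])

# For a high quantile (alpha=0.9), underestimating by 2 costs 0.9 * 2 = 1.8,
# while overestimating by 2 costs only (1 - 0.9) * 2 = 0.2
print(pinball_loss(y_true, y_true - 2, alpha=0.9))  # 1.8
print(pinball_loss(y_true, y_true + 2, alpha=0.9))  # 0.2
```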
Two disadvantages of quantile regression, compared to the bootstrap approach to prediction intervals, are that each quantile needs its own regressor and that quantile regression is not available for all types of regression models. However, once the models are trained, inference is much faster since no iterative process is needed.
This type of prediction interval can be easily estimated using ForecasterDirect and ForecasterDirectMultiVariate models.
💡 Tip
For more examples on how to use probabilistic forecasting, check out the other probabilistic forecasting articles in the skforecast documentation.
⚠ Warning
Forecasters of type ForecasterDirect are slower than ForecasterRecursive because they require training one model per step. Although they can achieve better performance, their scalability is an important limitation when many steps need to be predicted.
# Data processing
# ==============================================================================
import pandas as pd
from skforecast.datasets import fetch_dataset
# Plots
# ==============================================================================
import matplotlib.pyplot as plt
from skforecast.plot import set_dark_theme
# Modelling and Forecasting
# ==============================================================================
from lightgbm import LGBMRegressor
from skforecast.direct import ForecasterDirect
from skforecast.model_selection import TimeSeriesFold
from skforecast.model_selection import backtesting_forecaster
from skforecast.model_selection import bayesian_search_forecaster
from skforecast.metrics import calculate_coverage
from skforecast.metrics import create_mean_pinball_loss
# Configuration
# ==============================================================================
import warnings
warnings.filterwarnings('once')
# Data
# ==============================================================================
data = fetch_dataset(name="ett_m2_extended")
data = data[[
"OT",
"day_of_week_cos",
"day_of_week_sin",
"hour_cos",
"hour_sin",
"month_cos",
"month_sin",
"week_cos",
"week_sin",
"year",
]]
data.head(2)
ett_m2_extended
---------------
Data from an electricity transformer station was collected between July 2016 and July 2018 (2 years x 365 days x 24 hours x 4 intervals per hour = 70,080 data points). Each data point consists of 8 features, including the date of the point, the predictive value "Oil Temperature (OT)", and 6 different types of external power load features: High UseFul Load (HUFL), High UseLess Load (HULL), Middle UseFul Load (MUFL), Middle UseLess Load (MULL), Low UseFul Load (LUFL), Low UseLess Load (LULL). Additional variables are created based on calendar information (year, month, week, day of the week, and hour). These variables have been encoded using the cyclical encoding technique (sin and cos transformations) to preserve the cyclical nature of the data.
Zhou, Haoyi & Zhang, Shanghang & Peng, Jieqi & Zhang, Shuai & Li, Jianxin & Xiong, Hui & Zhang, Wancai. (2020). Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. [10.48550/arXiv.2012.07436](https://arxiv.org/abs/2012.07436). https://github.com/zhouhaoyi/ETDataset
Shape of the dataset: (69680, 16)
| date | OT | day_of_week_cos | day_of_week_sin | hour_cos | hour_sin | month_cos | month_sin | week_cos | week_sin | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2016-07-01 00:00:00 | 38.661999 | -0.900969 | -0.433884 | 1.0 | 0.0 | -0.866025 | -0.5 | -1.0 | 1.224647e-16 | 2016 |
| 2016-07-01 00:15:00 | 38.223000 | -0.900969 | -0.433884 | 1.0 | 0.0 | -0.866025 | -0.5 | -1.0 | 1.224647e-16 | 2016 |
# Split train-validation-test
# ==============================================================================
exog_features = data.columns.difference(['OT']).tolist()
end_train = '2017-10-01 23:59:00'
end_validation = '2018-04-03 23:59:00'
data_train = data.loc[: end_train, :]
data_val = data.loc[end_train:end_validation, :]
data_test = data.loc[end_validation:, :]
print(f"Dates train : {data_train.index.min()} --- {data_train.index.max()} (n={len(data_train)})")
print(f"Dates validation : {data_val.index.min()} --- {data_val.index.max()} (n={len(data_val)})")
print(f"Dates test : {data_test.index.min()} --- {data_test.index.max()} (n={len(data_test)})")
Dates train      : 2016-07-01 00:00:00 --- 2017-10-01 23:45:00 (n=43968)
Dates validation : 2017-10-02 00:00:00 --- 2018-04-03 23:45:00 (n=17664)
Dates test       : 2018-04-04 00:00:00 --- 2018-06-26 19:45:00 (n=8048)
# Plot partitions
# ==============================================================================
set_dark_theme()
plt.rcParams['lines.linewidth'] = 0.5
fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(data_train['OT'], label='Train')
ax.plot(data_val['OT'], label='Validation')
ax.plot(data_test['OT'], label='Test')
ax.set_title('Oil Temperature')
ax.legend();
Two quantile forecasters are trained, one for the 10% quantile and another for the 90% quantile. The predictions of both forecasters can be combined to generate a prediction interval of 80% coverage.
# Create forecasters: one for each bound of the interval
# ==============================================================================
# The forecasters obtained for alpha=0.1 and alpha=0.9 produce an 80% prediction
# interval (90% - 10% = 80%).
# Forecaster for quantile 10%
forecaster_q10 = ForecasterDirect(
regressor = LGBMRegressor(
objective = 'quantile',
metric = 'quantile',
alpha = 0.1,
verbose = -1
),
steps = 24,
lags = [1, 2, 3, 23, 24, 25, 47, 48, 49, 71, 72, 73]
)
# Forecaster for quantile 90%
forecaster_q90 = ForecasterDirect(
regressor = LGBMRegressor(
objective = 'quantile',
metric = 'quantile',
alpha = 0.9,
verbose = -1
),
steps = 24,
lags = [1, 2, 3, 23, 24, 25, 47, 48, 49, 71, 72, 73]
)
Next, a Bayesian search is performed to find the best hyperparameters for the quantile regressors. When validating a quantile regression model, it is important to use a metric that is coherent with the quantile being evaluated. In this case, the pinball loss is used. Skforecast provides the function create_mean_pinball_loss to calculate the pinball loss for a given quantile.
# Bayesian search of hyper-parameters and lags for each quantile forecaster
# ==============================================================================
def search_space(trial):
search_space = {
'n_estimators': trial.suggest_int('n_estimators', 100, 500, step=50),
'max_depth': trial.suggest_int('max_depth', 3, 10, step=1),
'learning_rate': trial.suggest_float('learning_rate', 0.001, 0.1)
}
return search_space
cv = TimeSeriesFold(
steps = 24,
initial_train_size = len(data[:end_train]),
refit = False,
)
results_grid_q10 = bayesian_search_forecaster(
forecaster = forecaster_q10,
y = data.loc[:end_validation, 'OT'],
cv = cv,
metric = create_mean_pinball_loss(alpha=0.1),
search_space = search_space,
n_trials = 10,
return_best = True
)
results_grid_q90 = bayesian_search_forecaster(
forecaster = forecaster_q90,
y = data.loc[:end_validation, 'OT'],
cv = cv,
metric = create_mean_pinball_loss(alpha=0.9),
search_space = search_space,
n_trials = 10,
return_best = True
)
`Forecaster` refitted using the best-found lags and parameters, and the whole data set:
Lags: [ 1  2  3 23 24 25 47 48 49 71 72 73]
Parameters: {'n_estimators': 250, 'max_depth': 3, 'learning_rate': 0.04040638127771271}
Backtesting metric: 0.41020051875078006
`Forecaster` refitted using the best-found lags and parameters, and the whole data set:
Lags: [ 1  2  3 23 24 25 47 48 49 71 72 73]
Parameters: {'n_estimators': 450, 'max_depth': 8, 'learning_rate': 0.061491327557080706}
Backtesting metric: 0.350142766138876
Predictions (backtesting)¶
Once the quantile forecasters are trained, they can be used to predict each of the bounds of the forecasting interval.
# Backtesting on test data
# ==============================================================================
cv = TimeSeriesFold(
steps = 24,
initial_train_size = len(data.loc[:end_validation]),
refit = False
)
metric_q10, predictions_q10 = backtesting_forecaster(
forecaster = forecaster_q10,
y = data['OT'],
cv = cv,
metric = create_mean_pinball_loss(alpha=0.1)
)
metric_q90, predictions_q90 = backtesting_forecaster(
forecaster = forecaster_q90,
y = data['OT'],
cv = cv,
metric = create_mean_pinball_loss(alpha=0.9)
)
pred_intervals = pd.concat([predictions_q10, predictions_q90], axis=1)
pred_intervals.columns = ['lower_bound', 'upper_bound']
pred_intervals.head()
| date | lower_bound | upper_bound |
|---|---|---|
| 2018-04-04 00:00:00 | 31.848286 | 32.268329 |
| 2018-04-04 00:15:00 | 31.563152 | 32.150039 |
| 2018-04-04 00:30:00 | 31.269266 | 32.271917 |
| 2018-04-04 00:45:00 | 31.081806 | 32.139591 |
| 2018-04-04 01:00:00 | 30.670913 | 32.049578 |
# Plot interval
# ==============================================================================
fig, ax = plt.subplots(figsize=(7, 3))
data.loc[end_validation:, 'OT'].plot(ax=ax, label='Real value', color='orange')
ax.fill_between(
data.loc[end_validation:].index,
pred_intervals['lower_bound'],
pred_intervals['upper_bound'],
color = 'gray',
alpha = 0.6,
zorder = 1,
label = '80% prediction interval'
)
ax.set_xlabel('')
ax.set_title("Predicted intervals")
ax.legend();
# Plot intervals with zoom ['2018-05-01', '2018-05-15']
# ==============================================================================
fig, ax = plt.subplots(figsize=(7, 3))
data.loc[end_validation:, 'OT'].plot(ax=ax, label='Real value', color='orange')
ax.fill_between(
data.loc[end_validation:].index,
pred_intervals['lower_bound'],
pred_intervals['upper_bound'],
color = 'gray',
alpha = 0.6,
zorder = 1,
label = '80% prediction interval'
)
ax.set_xlabel('')
ax.set_title("Predicted intervals (zoom in)")
ax.set_xlim(['2018-05-01', '2018-05-15']);
# Predicted interval coverage (on test data)
# ==============================================================================
coverage = calculate_coverage(
y_true = data.loc[end_validation:, 'OT'],
lower_bound = predictions_q10["pred"],
upper_bound = predictions_q90["pred"]
)
print(f"Predicted interval coverage: {round(100 * coverage, 2)} %")
# Area of the interval
# ==============================================================================
area = (predictions_q90["pred"] - predictions_q10["pred"]).sum()
print(f"Area of the interval: {round(area, 2)}")
Predicted interval coverage: 77.37 %
Area of the interval: 38980.18
The prediction intervals generated by quantile regression achieve an empirical coverage (77.37%) close to, although slightly below, the nominal coverage of 80%.
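As a sanity check, the coverage reported above is simply the fraction of observed values that fall inside the predicted interval. A minimal sketch with hypothetical numbers (not the ETT data):

```python
import pandas as pd

# Toy data: observed values and a prediction interval (hypothetical numbers)
y_true = pd.Series([10.0, 12.0, 11.0, 15.0, 9.0])
lower  = pd.Series([ 9.0, 11.5, 10.0, 15.5, 8.0])
upper  = pd.Series([11.0, 13.0, 12.5, 16.0, 10.5])

# Empirical coverage: fraction of true values inside [lower, upper]
inside = (y_true >= lower) & (y_true <= upper)
coverage = inside.mean()
print(f"Predicted interval coverage: {100 * coverage:.2f} %")  # 4 of 5 inside -> 80.00 %
```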