Window and custom features¶
In forecasting time series data, it can be useful to consider additional characteristics beyond just the lagged values. For instance, the moving average of the previous n values can help capture the trend in the series.
The ForecasterAutoregCustom and ForecasterAutoregMultiSeriesCustom classes are similar to ForecasterAutoreg and ForecasterAutoregMultiSeries, respectively, but they allow the user to define their own function for generating features.
To create custom features, the user needs to write a function that takes a time series as input (a NumPy ndarray) and outputs another NumPy ndarray containing the predictors. The Forecaster parameters used to specify this function are fun_predictors, window_size, and, optionally, name_predictors.
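As a minimal sketch of this contract (the function and variable names here are illustrative, not part of the skforecast API), fun_predictors receives the most recent window_size values of the series and must return a 1D ndarray with one entry per predictor:

# Minimal sketch of the fun_predictors contract (illustrative)
# ==============================================================================
import numpy as np

def minimal_predictors(y: np.ndarray) -> np.ndarray:
    """Return the last value (lag 1) and the mean of the last 5 values."""
    return np.hstack([y[-1], np.mean(y[-5:])])

# A forecaster using this function would need window_size = 5, the longest
# look-back used inside the function.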
Libraries¶
# Libraries
# ==============================================================================
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from skforecast.datasets import fetch_dataset
from skforecast.ForecasterAutoregCustom import ForecasterAutoregCustom
from skforecast.ForecasterAutoregMultiSeriesCustom import ForecasterAutoregMultiSeriesCustom
from lightgbm import LGBMRegressor
from sklearn.metrics import mean_squared_error
ForecasterAutoregCustom¶
# Download data
# ==============================================================================
data = fetch_dataset(
    name="h2o", raw=True, kwargs_read_csv={"names": ["y", "datetime"], "header": 0}
)
# Data preprocessing
# ==============================================================================
data['datetime'] = pd.to_datetime(data['datetime'], format='%Y-%m-%d')
data = data.set_index('datetime')
data = data.asfreq('MS')
data = data['y']
data = data.sort_index()
# Split train-test
# ==============================================================================
steps = 36
data_train = data[:-steps]
data_test = data[-steps:]
# Plot
# ==============================================================================
fig, ax = plt.subplots(figsize=(6, 3))
data_train.plot(ax=ax, label='train')
data_test.plot(ax=ax, label='test')
ax.legend();
h2o
---
Monthly expenditure ($AUD) on corticosteroid drugs that the Australian health
system had between 1991 and 2008. Hyndman R (2023). fpp3: Data for
Forecasting: Principles and Practice (3rd Edition).
http://pkg.robjhyndman.com/fpp3package/,
https://github.com/robjhyndman/fpp3package, http://OTexts.com/fpp3.
Shape of the dataset: (204, 2)
Window and custom features¶
The following function computes some of the most commonly used window and custom features in time series forecasting:

- The first 10 lags of the series.
- The moving average with a window of 20.
- The moving standard deviation with a window of 20.
- The moving minimum and maximum with a window of 20.
- The exponential moving average (EMA) with a window of 20, for smoothing.
- The difference between the last value and the moving average, to capture recent trends relative to the longer-term average.
# Custom function to create features
# ==============================================================================
def create_predictors(y):
    """
    Create first 10 lags of a time series.
    Calculate moving average with window 20.
    Calculate the moving standard deviation with window 20.
    Calculate moving minimum and maximum with window 20.
    Calculate exponential moving average with window 20.
    Calculate the difference between the last value and the moving average.
    """
    # Calculate lags
    lags = y[-1:-11:-1]  # window size needed = 10
    # Calculate moving average
    mean = np.mean(y[-20:])  # window size needed = 20
    # Calculate moving standard deviation
    std_dev = np.std(y[-20:])  # window size needed = 20
    # Calculate moving minimum and maximum
    min_val = np.min(y[-20:])
    max_val = np.max(y[-20:])
    # Calculate exponential moving average
    ema = np.mean(y[-20:] * np.exp(-np.linspace(1, 20, 20) / 20))
    # Calculate the difference between the last value and the moving average
    diff_mean = y[-1] - mean
    # Combine all predictors
    predictors = np.hstack([lags, mean, std_dev, min_val, max_val, ema, diff_mean])
    return predictors
feature_names = [f"lag {i}" for i in range(1, 11)] + [
    "moving_avg_20",
    "moving_std_20",
    "moving_min_20",
    "moving_max_20",
    "ema_20",
    "diff_mean",
]
⚠ Warning
The window_size parameter specifies the size of the data window that fun_predictors uses to generate each row of predictors. Choosing the appropriate value for this parameter is crucial to avoid losing data when constructing the training matrices. In this case, the window_size required by the mean is the largest (most restrictive), so window_size = 20.
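As an illustrative check of how window_size interacts with the training matrices (plain NumPy, mirroring but not invoking skforecast's internal logic), each training row consumes window_size past values and is paired with the following observation as target:

# Sliding-window check of create_predictors (illustrative)
# ==============================================================================
y_demo = np.arange(25, dtype=float)    # 25 observations
window_size = 20
n_rows = len(y_demo) - window_size     # only 5 training rows remain
X_demo = np.vstack(
    [create_predictors(y_demo[i:i + window_size]) for i in range(n_rows)]
)
y_demo_target = y_demo[window_size:]   # target aligned with each row
print(X_demo.shape, y_demo_target.shape)  # (5, 16) (5,)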
Train forecaster¶
# Create and fit forecaster
# ==============================================================================
forecaster = ForecasterAutoregCustom(
                 regressor       = LGBMRegressor(random_state=123, n_jobs=-1, verbose=-1),
                 fun_predictors  = create_predictors,
                 name_predictors = feature_names,
                 window_size     = 20  # window_size needed by the mean is the most restrictive one
             )
forecaster.fit(y=data_train)
forecaster
=======================
ForecasterAutoregCustom
=======================
Regressor: LGBMRegressor(n_jobs=-1, random_state=123, verbose=-1)
Predictors created with function: create_predictors
Transformer for y: None
Transformer for exog: None
Window size: 20
Weight function included: False
Differentiation order: None
Exogenous included: False
Type of exogenous variable: None
Exogenous variables names: None
Training range: [Timestamp('1991-07-01 00:00:00'), Timestamp('2005-06-01 00:00:00')]
Training index type: DatetimeIndex
Training index frequency: MS
Regressor parameters: {'boosting_type': 'gbdt', 'class_weight': None, 'colsample_bytree': 1.0, 'importance_type': 'split', 'learning_rate': 0.1, 'max_depth': -1, 'min_child_samples': 20, 'min_child_weight': 0.001, 'min_split_gain': 0.0, 'n_estimators': 100, 'n_jobs': -1, 'num_leaves': 31, 'objective': None, 'random_state': 123, 'reg_alpha': 0.0, 'reg_lambda': 0.0, 'subsample': 1.0, 'subsample_for_bin': 200000, 'subsample_freq': 0, 'verbose': -1}
fit_kwargs: {}
Creation date: 2024-07-29 18:17:19
Last fit date: 2024-07-29 18:17:19
Skforecast version: 0.13.0
Python version: 3.11.5
Forecaster id: None
Prediction¶
# Predict
# ==============================================================================
steps = 36
predictions = forecaster.predict(steps=steps)
predictions.head(3)
2005-07-01    0.951229
2005-08-01    0.969577
2005-09-01    1.127967
Freq: MS, Name: pred, dtype: float64
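Predictions are generated recursively from the last window_size observations seen in training. A different starting point can be supplied through the last_window argument of predict (a sketch; the window passed must contain at least the window_size most recent values):

# Predict from an explicitly supplied window (sketch)
# ==============================================================================
predictions_lw = forecaster.predict(steps=3, last_window=data_train.iloc[-20:])
predictions_lw.head(3)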
# Plot predictions
# ==============================================================================
fig, ax = plt.subplots(figsize=(6, 3))
data_train.plot(ax=ax, label='train')
data_test.plot(ax=ax, label='test')
predictions.plot(ax=ax, label='predictions')
ax.legend();
# Prediction error
# ==============================================================================
error_mse = mean_squared_error(
                y_true = data_test,
                y_pred = predictions
            )
print(f"Test error (MSE): {error_mse}")
Test error (MSE): 0.03392508498594471
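Since the MSE is expressed in squared units, the root mean squared error can be easier to interpret on the scale of the series (a small addition to the original output):

# Test error in the original units (RMSE)
# ==============================================================================
print(f"Test error (RMSE): {np.sqrt(error_mse):.4f}")  # approximately 0.1842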
Feature importances¶
# Predictors importances
# ==============================================================================
forecaster.get_feature_importances()
|    | feature       | importance |
|----|---------------|------------|
| 0  | lag 1         | 62         |
| 1  | lag 2         | 60         |
| 11 | moving_std_20 | 56         |
| 4  | lag 5         | 55         |
| 14 | ema_20        | 36         |
| 9  | lag 10        | 35         |
| 2  | lag 3         | 32         |
| 6  | lag 7         | 30         |
| 15 | diff_mean     | 30         |
| 3  | lag 4         | 19         |
| 8  | lag 9         | 17         |
| 10 | moving_avg_20 | 17         |
| 5  | lag 6         | 15         |
| 12 | moving_min_20 | 14         |
| 7  | lag 8         | 9          |
| 13 | moving_max_20 | 0          |
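Since get_feature_importances returns a pandas DataFrame, it can be sorted and plotted directly (a quick sketch, not part of the original guide):

# Plot predictors importances (sketch)
# ==============================================================================
importances = forecaster.get_feature_importances().sort_values('importance')
ax = importances.plot.barh(x='feature', y='importance', figsize=(6, 4), legend=False)
ax.set_title('Feature importances');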
Extract training matrices¶
# Training matrix (X)
# ==============================================================================
X, y = forecaster.create_train_X_y(data_train)
X.head()
| datetime   | lag 1    | lag 2    | lag 3    | lag 4    | lag 5    | lag 6    | lag 7    | lag 8    | lag 9    | lag 10   | moving_avg_20 | moving_std_20 | moving_min_20 | moving_max_20 | ema_20   | diff_mean |
|------------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|---------------|---------------|---------------|---------------|----------|-----------|
| 1993-03-01 | 0.387554 | 0.751503 | 0.771258 | 0.595223 | 0.568606 | 0.534761 | 0.475463 | 0.483389 | 0.410534 | 0.361801 | 0.496401      | 0.124727      | 0.33622       | 0.771258      | 0.298062 | -0.108846 |
| 1993-04-01 | 0.427283 | 0.387554 | 0.751503 | 0.771258 | 0.595223 | 0.568606 | 0.534761 | 0.475463 | 0.483389 | 0.410534 | 0.496275      | 0.124796      | 0.33622       | 0.771258      | 0.299714 | -0.068992 |
| 1993-05-01 | 0.413890 | 0.427283 | 0.387554 | 0.751503 | 0.771258 | 0.595223 | 0.568606 | 0.534761 | 0.475463 | 0.483389 | 0.496924      | 0.124331      | 0.33622       | 0.771258      | 0.302649 | -0.083034 |
| 1993-06-01 | 0.428859 | 0.413890 | 0.427283 | 0.387554 | 0.751503 | 0.771258 | 0.595223 | 0.568606 | 0.534761 | 0.475463 | 0.496759      | 0.124419      | 0.33622       | 0.771258      | 0.304446 | -0.067900 |
| 1993-07-01 | 0.470126 | 0.428859 | 0.413890 | 0.427283 | 0.387554 | 0.751503 | 0.771258 | 0.595223 | 0.568606 | 0.534761 | 0.495638      | 0.124552      | 0.33622       | 0.771258      | 0.304076 | -0.025512 |
# Target variable (y)
# ==============================================================================
y.head()
datetime
1993-03-01    0.427283
1993-04-01    0.413890
1993-05-01    0.428859
1993-06-01    0.470126
1993-07-01    0.509210
Freq: MS, Name: y, dtype: float64
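These matrices are the same ones the forecaster passes to its regressor internally, so they can also be used on their own, for example to inspect the pipeline or fit a regressor outside the forecaster (a sketch under that assumption):

# Fit a regressor directly on the extracted matrices (sketch)
# ==============================================================================
regressor = LGBMRegressor(random_state=123, n_jobs=-1, verbose=-1)
regressor.fit(X, y)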
Differentiation¶
Time series differentiation involves computing the differences between consecutive observations in the time series. When it comes to training forecasting models, differentiation offers the advantage of focusing on relative rates of change rather than directly attempting to model the absolute values. Once the predictions have been estimated, this transformation can be easily reversed to restore the values to their original scale.
💡 Tip
To learn more about modeling time series differentiation, visit our example: Modelling time series trend with tree based models.
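As a quick illustration of the transformation itself (plain NumPy, independent of skforecast): first-order differencing replaces each value with its change from the previous one, and the original scale is recovered by cumulatively summing the differences from the initial value.

# First-order differencing and its inverse (illustrative)
# ==============================================================================
y_demo = np.array([10.0, 12.0, 11.0, 15.0])
diff = np.diff(y_demo)                                      # [ 2. -1.  4.]
restored = np.concatenate(([y_demo[0]], y_demo[0] + np.cumsum(diff)))
print(restored)                                             # [10. 12. 11. 15.]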
# Create and fit forecaster
# ==============================================================================
forecaster = ForecasterAutoregCustom(
                 regressor       = LGBMRegressor(random_state=123, n_jobs=-1, verbose=-1),
                 fun_predictors  = create_predictors,
                 name_predictors = feature_names,
                 window_size     = 20,  # window_size needed by the mean is the most restrictive one
                 differentiation = 1
             )
forecaster.fit(y=data_train)
forecaster
=======================
ForecasterAutoregCustom
=======================
Regressor: LGBMRegressor(n_jobs=-1, random_state=123, verbose=-1)
Predictors created with function: create_predictors
Transformer for y: None
Transformer for exog: None
Window size: 20
Weight function included: False
Differentiation order: 1
Exogenous included: False
Type of exogenous variable: None
Exogenous variables names: None
Training range: [Timestamp('1991-07-01 00:00:00'), Timestamp('2005-06-01 00:00:00')]
Training index type: DatetimeIndex
Training index frequency: MS
Regressor parameters: {'boosting_type': 'gbdt', 'class_weight': None, 'colsample_bytree': 1.0, 'importance_type': 'split', 'learning_rate': 0.1, 'max_depth': -1, 'min_child_samples': 20, 'min_child_weight': 0.001, 'min_split_gain': 0.0, 'n_estimators': 100, 'n_jobs': -1, 'num_leaves': 31, 'objective': None, 'random_state': 123, 'reg_alpha': 0.0, 'reg_lambda': 0.0, 'subsample': 1.0, 'subsample_for_bin': 200000, 'subsample_freq': 0, 'verbose': -1}
fit_kwargs: {}
Creation date: 2024-07-29 18:17:20
Last fit date: 2024-07-29 18:17:20
Skforecast version: 0.13.0
Python version: 3.11.5
Forecaster id: None
# Predict
# ==============================================================================
predictions = forecaster.predict(steps=36)
predictions.head(3)
2005-07-01    0.891992
2005-08-01    0.899966
2005-09-01    1.031671
Freq: MS, Name: pred, dtype: float64
ForecasterAutoregMultiSeriesCustom¶
This Forecaster aims to model multiple time series simultaneously. It is based on the ForecasterAutoregMultiSeries class, but it allows the user to define their own function for generating predictors.
Please refer to the Global Forecasting Models: Independent multi-series forecasting user guide if you want to learn more about this type of forecasting.
💡 Tip
For more details about this forecaster, see the ForecasterAutoregMultiSeriesCustom API Reference.
✎ Note
Skforecast offers additional approaches to create Global Forecasting Models; see the other Global Forecasting Models user guides.
# Data download
# ==============================================================================
data = fetch_dataset(name="items_sales")
data.head()
items_sales
-----------
Simulated time series for the sales of 3 different items.
Simulated data.
Shape of the dataset: (1097, 3)
| date       | item_1    | item_2    | item_3    |
|------------|-----------|-----------|-----------|
| 2012-01-01 | 8.253175  | 21.047727 | 19.429739 |
| 2012-01-02 | 22.777826 | 26.578125 | 28.009863 |
| 2012-01-03 | 27.549099 | 31.751042 | 32.078922 |
| 2012-01-04 | 25.895533 | 24.567708 | 27.252276 |
| 2012-01-05 | 21.379238 | 18.191667 | 20.357737 |
# Split data into train-val-test
# ==============================================================================
end_train = '2014-07-15 23:59:00'
data_train = data.loc[:end_train, :].copy()
data_test = data.loc[end_train:, :].copy()
print(
    f"Train dates : {data_train.index.min()} --- {data_train.index.max()} "
    f"(n={len(data_train)})"
)
print(
    f"Test dates  : {data_test.index.min()} --- {data_test.index.max()} "
    f"(n={len(data_test)})"
)
Train dates : 2012-01-01 00:00:00 --- 2014-07-15 00:00:00  (n=927)
Test dates  : 2014-07-16 00:00:00 --- 2015-01-01 00:00:00  (n=170)
# Plot time series
# ==============================================================================
fig, axes = plt.subplots(nrows=3, ncols=1, figsize=(7.5, 4), sharex=True)
data_train['item_1'].plot(label='train', ax=axes[0])
data_test['item_1'].plot(label='test', ax=axes[0])
axes[0].set_xlabel('')
axes[0].set_ylabel('sales')
axes[0].set_title('Item 1')
axes[0].legend()
data_train['item_2'].plot(label='train', ax=axes[1])
data_test['item_2'].plot(label='test', ax=axes[1])
axes[1].set_xlabel('')
axes[1].set_ylabel('sales')
axes[1].set_title('Item 2')
data_train['item_3'].plot(label='train', ax=axes[2])
data_test['item_3'].plot(label='test', ax=axes[2])
axes[2].set_xlabel('')
axes[2].set_ylabel('sales')
axes[2].set_title('Item 3')
fig.tight_layout()
plt.show();
Train¶
# Custom function to create features
# ==============================================================================
def create_predictors(y):
    """
    Create first 10 lags of a time series.
    Calculate moving average with window 20.
    Calculate the moving standard deviation with window 20.
    Calculate moving minimum and maximum with window 20.
    Calculate exponential moving average with window 20.
    Calculate the difference between the last value and the moving average.
    """
    # Calculate lags
    lags = y[-1:-11:-1]  # window size needed = 10
    # Calculate moving average
    mean = np.mean(y[-20:])  # window size needed = 20
    # Calculate moving standard deviation
    std_dev = np.std(y[-20:])  # window size needed = 20
    # Calculate moving minimum and maximum
    min_val = np.min(y[-20:])
    max_val = np.max(y[-20:])
    # Calculate exponential moving average
    ema = np.mean(y[-20:] * np.exp(-np.linspace(1, 20, 20) / 20))
    # Calculate the difference between the last value and the moving average
    diff_mean = y[-1] - mean
    # Combine all predictors
    predictors = np.hstack([lags, mean, std_dev, min_val, max_val, ema, diff_mean])
    return predictors
feature_names = [f"lag {i}" for i in range(1, 11)] + [
    "moving_avg_20",
    "moving_std_20",
    "moving_min_20",
    "moving_max_20",
    "ema_20",
    "diff_mean",
]
⚠ Warning
The window_size parameter specifies the size of the data window that fun_predictors uses to generate each row of predictors. Choosing the appropriate value for this parameter is crucial to avoid losing data when constructing the training matrices. In this case, the window_size required by the mean is the largest (most restrictive), so window_size = 20.
# Create and fit forecaster
# ==============================================================================
forecaster = ForecasterAutoregMultiSeriesCustom(
                 regressor       = LGBMRegressor(random_state=123, n_jobs=-1, verbose=-1),
                 fun_predictors  = create_predictors,
                 name_predictors = feature_names,
                 window_size     = 20  # window_size needed by the mean is the most restrictive one
             )
forecaster.fit(series=data_train)
forecaster
==================================
ForecasterAutoregMultiSeriesCustom
==================================
Regressor: LGBMRegressor(n_jobs=-1, random_state=123, verbose=-1)
Predictors created with function: create_predictors
Transformer for series: None
Transformer for exog: None
Series encoding: ordinal_category
Window size: 20
Series levels (names): ['item_1', 'item_2', 'item_3']
Series weights: None
Weight function included: False
Differentiation order: None
Exogenous included: False
Type of exogenous variable: None
Exogenous variables names: None
Training range: ["'item_1': ['2012-01-01', '2014-07-15']", "'item_2': ['2012-01-01', '2014-07-15']", "'item_3': ['2012-01-01', '2014-07-15']"]
Training index type: DatetimeIndex
Training index frequency: D
Regressor parameters: boosting_type: gbdt, class_weight: None, colsample_bytree: 1.0, importance_type: split, learning_rate: 0.1, ...
fit_kwargs: {}
Creation date: 2024-07-29 18:17:23
Last fit date: 2024-07-29 18:17:24
Skforecast version: 0.13.0
Python version: 3.11.5
Forecaster id: None
Predict¶
If no value is specified for the levels argument, predictions will be computed for all available levels.
# Predict
# ==============================================================================
steps = 36
predictions = forecaster.predict(steps=steps, levels=None)
predictions.head(3)
|            | item_1    | item_2    | item_3    |
|------------|-----------|-----------|-----------|
| 2014-07-16 | 25.691486 | 11.033865 | 13.535166 |
| 2014-07-17 | 25.404832 | 10.536406 | 12.908039 |
| 2014-07-18 | 25.234903 | 12.085611 | 13.228491 |
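To forecast only a subset of the series, the levels argument also accepts a single name or a list of names (sketch):

# Predict a single series (sketch)
# ==============================================================================
predictions_item_1 = forecaster.predict(steps=3, levels='item_1')
predictions_item_1.head(3)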
Training matrices¶
# Training matrix (X)
# ==============================================================================
X_train, y_train = forecaster.create_train_X_y(data_train)
X_train.head()
| date       | lag 1     | lag 2     | lag 3     | lag 4     | lag 5     | lag 6     | lag 7     | lag 8     | lag 9     | lag 10    | moving_avg_20 | moving_std_20 | moving_min_20 | moving_max_20 | ema_20    | diff_mean | _level_skforecast |
|------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|---------------|---------------|---------------|---------------|-----------|-----------|-------------------|
| 2012-01-21 | 24.018768 | 24.772249 | 29.245869 | 26.636444 | 20.228468 | 18.976196 | 20.784194 | 21.758617 | 21.751748 | 21.717691 | 21.954065     | 4.161762      | 8.253175      | 29.245869     | 13.267277 | 2.064703  | 0                 |
| 2012-01-22 | 22.503533 | 24.018768 | 24.772249 | 29.245869 | 26.636444 | 20.228468 | 18.976196 | 20.784194 | 21.758617 | 21.751748 | 22.666583     | 2.727996      | 18.976196     | 29.245869     | 13.948776 | -0.163050 | 0                 |
| 2012-01-23 | 20.794986 | 22.503533 | 24.018768 | 24.772249 | 29.245869 | 26.636444 | 20.228468 | 18.976196 | 20.784194 | 21.758617 | 22.567441     | 2.758017      | 18.976196     | 29.245869     | 13.907556 | -1.772455 | 0                 |
| 2012-01-24 | 23.981037 | 20.794986 | 22.503533 | 24.018768 | 24.772249 | 29.245869 | 26.636444 | 20.228468 | 18.976196 | 20.784194 | 22.389038     | 2.536512      | 18.976196     | 29.245869     | 13.684263 | 1.592000  | 0                 |
| 2012-01-25 | 28.018830 | 23.981037 | 20.794986 | 22.503533 | 24.018768 | 24.772249 | 29.245869 | 26.636444 | 20.228468 | 18.976196 | 22.495203     | 2.718929      | 18.976196     | 29.245869     | 13.606471 | 5.523627  | 0                 |
# Target variable (y)
# ==============================================================================
y_train.head()
date
2012-01-21    22.503533
2012-01-22    20.794986
2012-01-23    23.981037
2012-01-24    28.018830
2012-01-25    28.747482
Name: y, dtype: float64
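The extra _level_skforecast column identifies which series each training row belongs to; given the ordinal_category encoding shown in the forecaster summary, 0, 1 and 2 presumably correspond to item_1, item_2 and item_3. A quick way to inspect it:

# Inspect the series-identifier column (sketch)
# ==============================================================================
X_train['_level_skforecast'].value_counts()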