MASE is a scale-independent error metric that measures the accuracy of
a forecast. It is the mean absolute error of the forecast divided by the
mean absolute error of a one-step naive forecast on the training set. The
naive forecast predicts each value with the previous observation, that is,
the time series shifted by one period.
If y_train is a list of numpy arrays or pandas Series, each element is
treated as the training values of one time series. In this case, the naive
forecast errors are calculated for each time series separately and then
pooled before averaging.
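For a single training series, the definition above can be written as follows. The symbols are introduced here for illustration only and are an assumption, not the library's notation: $h$ is the number of forecasted values, $n$ is the length of the training series, and $\hat{y}_i$ is the prediction for $y_i$.

$$
\mathrm{MASE} = \frac{\frac{1}{h}\sum_{i=1}^{h} \left| y_i - \hat{y}_i \right|}{\frac{1}{n-1}\sum_{t=2}^{n} \left| y^{\mathrm{train}}_t - y^{\mathrm{train}}_{t-1} \right|}
$$

When y_train is a list, the denominator is the mean of the absolute one-step differences of all series taken together.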
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| y_true | pandas Series, numpy ndarray | True values of the target variable. | required |
| y_pred | pandas Series, numpy ndarray | Predicted values of the target variable. | required |
| y_train | list, pandas Series, numpy ndarray | True values of the target variable in the training set. If a list, each element is taken to be the training values of one time series. | required |
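As a quick numeric check, the sketch below restates the MASE computation with NumPy only. The helper name `mase_sketch` and the sample values are hypothetical and chosen for illustration; the helper skips the input validation that the library function performs.

```python
import numpy as np


def mase_sketch(y_true, y_pred, y_train):
    """Hypothetical NumPy-only restatement of the MASE computation."""
    # A list means one training array per series: pool their one-step differences.
    if isinstance(y_train, list):
        naive_errors = np.concatenate([np.diff(x) for x in y_train])
    else:
        naive_errors = np.diff(y_train)
    return np.mean(np.abs(y_true - y_pred)) / np.nanmean(np.abs(naive_errors))


y_train = np.array([10.0, 12.0, 11.0, 15.0])  # one-step differences: 2, -1, 4
y_true = np.array([14.0, 16.0])
y_pred = np.array([15.0, 14.0])               # absolute errors: 1, 2

# Single series: 1.5 / (7 / 3)
single = mase_sketch(y_true, y_pred, y_train)
print(f"MASE, single series: {single:.4f}")

# Two series: the differences 2, -1, 4 and 1, 2 are pooled, so 1.5 / 2
other_train = np.array([1.0, 2.0, 4.0])
pooled = mase_sketch(y_true, y_pred, [y_train, other_train])
print(f"MASE, pooled series: {pooled:.4f}")
```

A value below 1 means the forecast's mean absolute error is smaller than that of the one-step naive forecast on the training data.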
import numpy as np
import pandas as pd


def mean_absolute_scaled_error(
    y_true: np.ndarray | pd.Series,
    y_pred: np.ndarray | pd.Series,
    y_train: list[np.ndarray | pd.Series] | np.ndarray | pd.Series,
) -> float:
    """
    Mean Absolute Scaled Error (MASE)

    MASE is a scale-independent error metric that measures the accuracy of
    a forecast. It is the mean absolute error of the forecast divided by the
    mean absolute error of a naive forecast in the training set. The naive
    forecast is the one obtained by shifting the time series by one period.
    If y_train is a list of numpy arrays or pandas Series, it is considered
    that each element is the true value of the target variable in the training
    set for each time series. In this case, the naive forecast is calculated
    for each time series separately.

    Parameters
    ----------
    y_true : pandas Series, numpy ndarray
        True values of the target variable.
    y_pred : pandas Series, numpy ndarray
        Predicted values of the target variable.
    y_train : list, pandas Series, numpy ndarray
        True values of the target variable in the training set. If `list`, it
        is considered that each element is the true value of the target variable
        in the training set for each time series.

    Returns
    -------
    mase : float
        MASE value.

    """

    # NOTE: When using this metric in validation, `y_train` doesn't include
    # the first window_size observations used to create the predictors and/or
    # rolling features.

    if not isinstance(y_true, (pd.Series, np.ndarray)):
        raise TypeError("`y_true` must be a pandas Series or numpy ndarray.")
    if not isinstance(y_pred, (pd.Series, np.ndarray)):
        raise TypeError("`y_pred` must be a pandas Series or numpy ndarray.")
    if not isinstance(y_train, (list, pd.Series, np.ndarray)):
        raise TypeError("`y_train` must be a list, pandas Series or numpy ndarray.")
    if isinstance(y_train, list):
        for x in y_train:
            if not isinstance(x, (pd.Series, np.ndarray)):
                raise TypeError(
                    "When `y_train` is a list, each element must be a pandas Series "
                    "or numpy ndarray."
                )
    if len(y_true) != len(y_pred):
        raise ValueError("`y_true` and `y_pred` must have the same length.")
    if len(y_true) == 0 or len(y_pred) == 0:
        raise ValueError("`y_true` and `y_pred` must have at least one element.")

    # One-step naive forecast errors on the training data (pooled across series)
    if isinstance(y_train, list):
        naive_forecast = np.concatenate([np.diff(x) for x in y_train])
    else:
        naive_forecast = np.diff(y_train)

    mase = np.mean(np.abs(y_true - y_pred)) / np.nanmean(np.abs(naive_forecast))

    return mase

RMSSE is a scale-independent error metric that measures the accuracy of
a forecast. It is the root mean squared error of the forecast divided by
the root mean squared error of a one-step naive forecast on the training
set. The naive forecast predicts each value with the previous observation,
that is, the time series shifted by one period.
If y_train is a list of numpy arrays or pandas Series, each element is
treated as the training values of one time series. In this case, the naive
forecast errors are calculated for each time series separately and then
pooled before averaging.
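For a single training series, this definition can be written as follows. The symbols are introduced here for illustration only and are an assumption, not the library's notation: $h$ is the number of forecasted values, $n$ is the length of the training series, and $\hat{y}_i$ is the prediction for $y_i$.

$$
\mathrm{RMSSE} = \frac{\sqrt{\frac{1}{h}\sum_{i=1}^{h} \left( y_i - \hat{y}_i \right)^2}}{\sqrt{\frac{1}{n-1}\sum_{t=2}^{n} \left( y^{\mathrm{train}}_t - y^{\mathrm{train}}_{t-1} \right)^2}}
$$

When y_train is a list, the denominator uses the squared one-step differences of all series taken together.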
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| y_true | pandas Series, numpy ndarray | True values of the target variable. | required |
| y_pred | pandas Series, numpy ndarray | Predicted values of the target variable. | required |
| y_train | list, pandas Series, numpy ndarray | True values of the target variable in the training set. If a list, each element is taken to be the training values of one time series. | required |
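To illustrate what "scale-independent" means in practice, the sketch below restates the RMSSE computation with NumPy only and evaluates it on the same data expressed in two different units. The helper name `rmsse_sketch` and the sample values are hypothetical, and the helper omits the library function's input validation.

```python
import numpy as np


def rmsse_sketch(y_true, y_pred, y_train):
    """Hypothetical NumPy-only restatement of the RMSSE computation."""
    # A list means one training array per series: pool their one-step differences.
    if isinstance(y_train, list):
        naive_errors = np.concatenate([np.diff(x) for x in y_train])
    else:
        naive_errors = np.diff(y_train)
    rmse_forecast = np.sqrt(np.mean((y_true - y_pred) ** 2))
    rmse_naive = np.sqrt(np.nanmean(naive_errors ** 2))
    return rmse_forecast / rmse_naive


y_train = np.array([10.0, 12.0, 11.0, 15.0])  # squared differences: 4, 1, 16
y_true = np.array([14.0, 16.0])
y_pred = np.array([15.0, 14.0])               # squared errors: 1, 4

rmsse = rmsse_sketch(y_true, y_pred, y_train)  # sqrt(2.5) / sqrt(7)

# Same data in units 1000 times smaller: the ratio is unchanged.
rmsse_rescaled = rmsse_sketch(y_true * 1000, y_pred * 1000, y_train * 1000)

print(f"RMSSE: {rmsse:.4f}, after rescaling: {rmsse_rescaled:.4f}")
```

Because numerator and denominator scale together, the metric can be compared across series measured in different units.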
import numpy as np
import pandas as pd


def root_mean_squared_scaled_error(
    y_true: np.ndarray | pd.Series,
    y_pred: np.ndarray | pd.Series,
    y_train: list[np.ndarray | pd.Series] | np.ndarray | pd.Series,
) -> float:
    """
    Root Mean Squared Scaled Error (RMSSE)

    RMSSE is a scale-independent error metric that measures the accuracy of
    a forecast. It is the root mean squared error of the forecast divided by
    the root mean squared error of a naive forecast in the training set. The
    naive forecast is the one obtained by shifting the time series by one period.
    If y_train is a list of numpy arrays or pandas Series, it is considered
    that each element is the true value of the target variable in the training
    set for each time series. In this case, the naive forecast is calculated
    for each time series separately.

    Parameters
    ----------
    y_true : pandas Series, numpy ndarray
        True values of the target variable.
    y_pred : pandas Series, numpy ndarray
        Predicted values of the target variable.
    y_train : list, pandas Series, numpy ndarray
        True values of the target variable in the training set. If `list`, it
        is considered that each element is the true value of the target variable
        in the training set for each time series.

    Returns
    -------
    rmsse : float
        RMSSE value.

    """

    # NOTE: When using this metric in validation, `y_train` doesn't include
    # the first window_size observations used to create the predictors and/or
    # rolling features.

    if not isinstance(y_true, (pd.Series, np.ndarray)):
        raise TypeError("`y_true` must be a pandas Series or numpy ndarray.")
    if not isinstance(y_pred, (pd.Series, np.ndarray)):
        raise TypeError("`y_pred` must be a pandas Series or numpy ndarray.")
    if not isinstance(y_train, (list, pd.Series, np.ndarray)):
        raise TypeError("`y_train` must be a list, pandas Series or numpy ndarray.")
    if isinstance(y_train, list):
        for x in y_train:
            if not isinstance(x, (pd.Series, np.ndarray)):
                raise TypeError(
                    "When `y_train` is a list, each element must be a pandas Series "
                    "or numpy ndarray."
                )
    if len(y_true) != len(y_pred):
        raise ValueError("`y_true` and `y_pred` must have the same length.")
    if len(y_true) == 0 or len(y_pred) == 0:
        raise ValueError("`y_true` and `y_pred` must have at least one element.")

    # One-step naive forecast errors on the training data (pooled across series)
    if isinstance(y_train, list):
        naive_forecast = np.concatenate([np.diff(x) for x in y_train])
    else:
        naive_forecast = np.diff(y_train)

    rmsse = np.sqrt(np.mean((y_true - y_pred) ** 2)) / np.sqrt(
        np.nanmean(naive_forecast ** 2)
    )

    return rmsse

Compute the Symmetric Mean Absolute Percentage Error (SMAPE).
SMAPE is a relative error metric used to measure the accuracy
of forecasts. Unlike MAPE, it divides each absolute error by the average
of the absolute actual and predicted values, which makes it symmetric and
avoids division by zero whenever at least one of the two values is non-zero.
When both are zero, the corresponding term is treated as zero.
The result is expressed as a percentage and ranges from 0%
(perfect prediction) to 200% (maximum error).
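In symbols, the definition reads as follows. The notation is introduced here for illustration and is an assumption, not the library's: $n$ is the number of observations and $\hat{y}_i$ is the prediction for $y_i$.

$$
\mathrm{SMAPE} = \frac{100}{n}\sum_{i=1}^{n} \frac{\left| y_i - \hat{y}_i \right|}{\left( \left| y_i \right| + \left| \hat{y}_i \right| \right) / 2}
$$

Since $\left| y_i - \hat{y}_i \right| \le \left| y_i \right| + \left| \hat{y}_i \right|$, each term is at most 2, which gives the 200% upper bound.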
import numpy as np
import pandas as pd


def symmetric_mean_absolute_percentage_error(
    y_true: np.ndarray | pd.Series,
    y_pred: np.ndarray | pd.Series
) -> float:
    """
    Compute the Symmetric Mean Absolute Percentage Error (SMAPE).

    SMAPE is a relative error metric used to measure the accuracy
    of forecasts. Unlike MAPE, it is symmetric and prevents division
    by zero by averaging the absolute values of actual and predicted values.

    The result is expressed as a percentage and ranges from 0%
    (perfect prediction) to 200% (maximum error).

    Parameters
    ----------
    y_true : numpy ndarray, pandas Series
        True values of the target variable.
    y_pred : numpy ndarray, pandas Series
        Predicted values of the target variable.

    Returns
    -------
    smape : float
        SMAPE value as a percentage.

    Notes
    -----
    When both `y_true` and `y_pred` are zero, the corresponding term is
    treated as zero to avoid division by zero.

    Examples
    --------
    ```python
    import numpy as np
    from skforecast.metrics import symmetric_mean_absolute_percentage_error

    y_true = np.array([100, 200, 0])
    y_pred = np.array([110, 180, 10])
    result = symmetric_mean_absolute_percentage_error(y_true, y_pred)
    print(f"SMAPE: {result:.2f}%")
    # SMAPE: 73.35%
    ```

    """

    if not isinstance(y_true, (pd.Series, np.ndarray)):
        raise TypeError("`y_true` must be a pandas Series or numpy ndarray.")
    if not isinstance(y_pred, (pd.Series, np.ndarray)):
        raise TypeError("`y_pred` must be a pandas Series or numpy ndarray.")
    if len(y_true) != len(y_pred):
        raise ValueError("`y_true` and `y_pred` must have the same length.")
    if len(y_true) == 0 or len(y_pred) == 0:
        raise ValueError("`y_true` and `y_pred` must have at least one element.")

    numerator = np.abs(y_true - y_pred)
    denominator = (np.abs(y_true) + np.abs(y_pred)) / 2

    # NOTE: Avoid division by zero
    mask = denominator != 0
    smape_values = np.zeros_like(denominator)
    smape_values[mask] = numerator[mask] / denominator[mask]

    smape = 100 * np.mean(smape_values)

    return smape

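The edge cases described in the Notes can be checked with a small NumPy-only sketch. The helper name `smape_sketch` and the inputs are hypothetical; the helper follows the masking logic of the source above but skips its input validation.

```python
import numpy as np


def smape_sketch(y_true, y_pred):
    """Hypothetical NumPy-only restatement of the SMAPE computation."""
    numerator = np.abs(y_true - y_pred)
    denominator = (np.abs(y_true) + np.abs(y_pred)) / 2
    # Terms whose denominator is zero (both values are zero) stay at zero.
    terms = np.zeros_like(denominator)
    mask = denominator != 0
    terms[mask] = numerator[mask] / denominator[mask]
    return 100 * np.mean(terms)


# Both values zero in the first position: that term counts as 0, no warning raised.
both_zero = smape_sketch(np.array([0.0, 5.0]), np.array([0.0, 5.0]))

# Only the actual value is zero: 3 / 1.5 = 2, the maximum per-term error.
one_zero = smape_sketch(np.array([0.0]), np.array([3.0]))

# Opposite signs also reach the upper bound: 2 / 1 = 2.
opposite = smape_sketch(np.array([1.0]), np.array([-1.0]))

print(both_zero, one_zero, opposite)
```

Note that any forecast for a true value of zero is scored at 200% unless the forecast is also zero, which is worth keeping in mind for intermittent series.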
from sklearn.metrics import mean_pinball_loss


def create_mean_pinball_loss(alpha: float) -> callable:
    """
    Create pinball loss, also known as quantile loss, for a given quantile.
    Internally, it uses the `mean_pinball_loss` function from scikit-learn.

    Parameters
    ----------
    alpha: float
        Quantile for which the Pinball loss is calculated.
        Must be between 0 and 1, inclusive.

    Returns
    -------
    mean_pinball_loss_q: callable
        Mean Pinball loss for the given quantile.

    """

    if not (0 <= alpha <= 1):
        raise ValueError("alpha must be between 0 and 1, both inclusive.")

    # Closure that fixes `alpha`, so the result has the usual
    # (y_true, y_pred) metric signature.
    def mean_pinball_loss_q(y_true, y_pred):
        return mean_pinball_loss(y_true, y_pred, alpha=alpha)

    return mean_pinball_loss_q

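The closure pattern used above can be illustrated without scikit-learn. The sketch below is an assumption-laden stand-in: `pinball_sketch` is a hypothetical name, and the loss is written out directly with NumPy using the standard pinball formula, alpha * max(y - q, 0) + (1 - alpha) * max(q - y, 0), rather than calling `mean_pinball_loss`.

```python
import numpy as np


def pinball_sketch(alpha):
    """Hypothetical factory returning a (y_true, y_pred) pinball loss for `alpha`."""
    if not (0 <= alpha <= 1):
        raise ValueError("alpha must be between 0 and 1, both inclusive.")

    def loss(y_true, y_pred):
        diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
        # Under-predictions (diff >= 0) are weighted by alpha,
        # over-predictions by (1 - alpha).
        return float(np.mean(np.where(diff >= 0, alpha * diff, (alpha - 1) * diff)))

    return loss


y_true = np.array([10.0, 20.0, 30.0])
y_pred = np.array([12.0, 19.0, 30.0])  # differences: -2, 1, 0

loss_q90 = pinball_sketch(0.9)(y_true, y_pred)  # (0.2 + 0.9 + 0) / 3
loss_q10 = pinball_sketch(0.1)(y_true, y_pred)  # (1.8 + 0.1 + 0) / 3

print(f"alpha=0.9: {loss_q90:.4f}, alpha=0.1: {loss_q10:.4f}")
```

Fixing the quantile in a closure is what lets the result be passed anywhere a two-argument metric is expected.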
import inspect
from functools import wraps
from typing import Callable


def add_y_train_argument(func: Callable) -> Callable:
    """
    Add `y_train` argument to a function if it is not already present.

    Parameters
    ----------
    func : callable
        Function to which the argument is added.

    Returns
    -------
    wrapper : callable
        Function with `y_train` argument added.

    """

    sig = inspect.signature(func)

    # Nothing to do if the function already accepts `y_train`
    if "y_train" in sig.parameters:
        return func

    new_params = list(sig.parameters.values()) + [
        inspect.Parameter("y_train", inspect.Parameter.KEYWORD_ONLY, default=None)
    ]
    new_sig = sig.replace(parameters=new_params)

    # The wrapper accepts `y_train` and discards it before calling `func`
    @wraps(func)
    def wrapper(*args, y_train=None, **kwargs):
        return func(*args, **kwargs)

    wrapper.__signature__ = new_sig

    return wrapper
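A short usage sketch may help show what the decorator changes. The function body is copied from the source above so that the block runs on its own; the metrics `mean_abs_error` and `scaled_error` are hypothetical examples, not part of the library.

```python
import inspect
from functools import wraps

import numpy as np


def add_y_train_argument(func):
    """Copy of the function above, repeated so this example is self-contained."""
    sig = inspect.signature(func)
    if "y_train" in sig.parameters:
        return func
    new_params = list(sig.parameters.values()) + [
        inspect.Parameter("y_train", inspect.Parameter.KEYWORD_ONLY, default=None)
    ]
    new_sig = sig.replace(parameters=new_params)

    @wraps(func)
    def wrapper(*args, y_train=None, **kwargs):
        return func(*args, **kwargs)

    wrapper.__signature__ = new_sig
    return wrapper


def mean_abs_error(y_true, y_pred):
    """Hypothetical metric that does not take `y_train`."""
    return float(np.mean(np.abs(y_true - y_pred)))


def scaled_error(y_true, y_pred, y_train):
    """Hypothetical metric that already takes `y_train`."""
    return float(np.mean(np.abs(y_true - y_pred)) / np.mean(np.abs(np.diff(y_train))))


wrapped = add_y_train_argument(mean_abs_error)
params = list(inspect.signature(wrapped).parameters)
print(params)  # the reported signature now includes `y_train`

# `y_train` is accepted and silently ignored by the wrapped metric.
value = wrapped(
    np.array([1.0, 2.0]), np.array([2.0, 4.0]), y_train=np.array([0.0, 1.0])
)

# A metric that already has `y_train` is returned unchanged.
unchanged = add_y_train_argument(scaled_error) is scaled_error
```

This lets calling code pass `y_train` to every metric uniformly, whether or not the metric uses it.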