---
title: PyTorch losses
keywords: fastai
sidebar: home_sidebar
summary: "Training losses."
description: "Training losses."
nb_path: "nbs/losses__pytorch.ipynb"
---
{% raw %}
{% endraw %} {% raw %}
{% endraw %} {% raw %}

divide_no_nan[source]

divide_no_nan(a, b)

Auxiliary function to handle division by zero.
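
A minimal sketch of an equivalent helper, assuming the intended behavior is to return zero wherever the division is undefined (the name divide_no_nan_sketch is illustrative, not the library function):

import torch as t

def divide_no_nan_sketch(a, b):
    # elementwise a / b, replacing NaN (0/0) and inf (x/0) with 0
    div = a / b
    div[div != div] = 0.0
    div[div == float('inf')] = 0.0
    div[div == float('-inf')] = 0.0
    return div

print(divide_no_nan_sketch(t.tensor([1.0, 0.0]), t.tensor([0.0, 0.0])))  # tensor([0., 0.])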

{% endraw %} {% raw %}
{% endraw %}

PyTorch Train Losses

{% raw %}

MAPELoss[source]

MAPELoss(y, y_hat, mask=None)

MAPE Loss

Calculates the Mean Absolute Percentage Error between y and y_hat. MAPE measures the relative prediction accuracy of a forecasting method by calculating the percentage deviation of the prediction from the true value at a given time and averaging these deviations over the length of the series. As defined in: https://en.wikipedia.org/wiki/Mean_absolute_percentage_error

Parameters

y: tensor (batch_size, output_size), actual values in torch tensor.
y_hat: tensor (batch_size, output_size), predicted values in torch tensor.
mask: tensor (batch_size, output_size), specifies date stamps per series to consider in the loss.

Returns

mape: Mean absolute percentage error.
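
A minimal sketch of the computation described above (mape_sketch is illustrative; the library presumably guards zero actual values with divide_no_nan):

import torch as t

def mape_sketch(y, y_hat, mask=None):
    if mask is None:
        mask = t.ones_like(y_hat)
    # absolute percentage deviation at each date stamp, averaged over the series
    mape = t.abs(y - y_hat) / t.abs(y)
    return t.mean(mape * mask)

y = t.tensor([[100.0, 200.0]])
y_hat = t.tensor([[110.0, 180.0]])
print(mape_sketch(y, y_hat))  # tensor(0.1000)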

{% endraw %} {% raw %}
{% endraw %} {% raw %}

MSELoss[source]

MSELoss(y, y_hat, mask=None)

MSE Loss

Calculates the Mean Squared Error between y and y_hat. MSE measures the prediction accuracy of a forecasting method by calculating the squared deviation of the prediction from the true value at a given time and averaging these deviations over the length of the series.

Parameters

y: tensor (batch_size, output_size), actual values in torch tensor.
y_hat: tensor (batch_size, output_size), predicted values in torch tensor.
mask: tensor (batch_size, output_size), specifies date stamps per series to consider in the loss.

Returns

mse: Mean Squared Error.
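
A hedged sketch of the masked computation (mse_sketch is illustrative, not the library function):

import torch as t

def mse_sketch(y, y_hat, mask=None):
    if mask is None:
        mask = t.ones_like(y_hat)
    # squared deviation at each date stamp, averaged over the series
    return t.mean(((y - y_hat) ** 2) * mask)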

{% endraw %} {% raw %}
{% endraw %} {% raw %}

RMSELoss[source]

RMSELoss(y, y_hat, mask=None)

RMSE Loss

Calculates the Root Mean Squared Error between y and y_hat. RMSE measures the prediction accuracy of a forecasting method by calculating the squared deviation of the prediction from the true value at a given time, averaging these deviations over the length of the series, and taking the square root of the result.

Parameters

y: tensor (batch_size, output_size), actual values in torch tensor.
y_hat: tensor (batch_size, output_size), predicted values in torch tensor.
mask: tensor (batch_size, output_size), specifies date stamps per series to consider in the loss.

Returns

rmse: Root Mean Squared Error.
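
A hedged sketch, reusing the MSE computation above and taking the square root (rmse_sketch is illustrative):

import torch as t

def rmse_sketch(y, y_hat, mask=None):
    if mask is None:
        mask = t.ones_like(y_hat)
    return t.sqrt(t.mean(((y - y_hat) ** 2) * mask))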

{% endraw %} {% raw %}
{% endraw %} {% raw %}

SMAPELoss[source]

SMAPELoss(y, y_hat, mask=None)

SMAPE Loss

Calculates the Symmetric Mean Absolute Percentage Error. SMAPE measures the relative prediction accuracy of a forecasting method by calculating the relative deviation of the prediction from the true value, scaled by the sum of the absolute values of the prediction and the true value at a given time, and then averaging these deviations over the length of the series. This bounds SMAPE between 0% and 200%, which is desirable compared to the standard MAPE, which can be undefined when the true value is zero.

Parameters

y: tensor (batch_size, output_size), actual values in torch tensor.
y_hat: tensor (batch_size, output_size), predicted values in torch tensor.
mask: tensor (batch_size, output_size), specifies date stamps per series to consider in the loss.

Returns

smape: Symmetric mean absolute percentage error.

References

[1] https://robjhyndman.com/hyndsight/smape/ (Makridakis 1993)
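
A minimal sketch of the scaled deviation described above; the factor 2 yields the 0%-200% bounds (smape_sketch is illustrative, and the library presumably guards zero denominators with divide_no_nan):

import torch as t

def smape_sketch(y, y_hat, mask=None):
    if mask is None:
        mask = t.ones_like(y_hat)
    delta = t.abs(y - y_hat)
    scale = t.abs(y) + t.abs(y_hat)  # assumed nonzero here
    return 2 * t.mean((delta / scale) * mask)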

{% endraw %} {% raw %}
{% endraw %} {% raw %}

MASELoss[source]

MASELoss(y, y_hat, y_insample, seasonality, mask=None)

Calculates the M4 Mean Absolute Scaled Error.

MASE measures the relative prediction accuracy of a forecasting method by comparing the mean absolute error of the prediction against the mean absolute error of the seasonal naive model.

Parameters

y: tensor (batch_size, output_size), actual test values.
y_hat: tensor (batch_size, output_size), predicted values.
y_insample: tensor (batch_size, input_size), actual insample values for the Seasonal Naive predictions.
seasonality: int, main frequency of the time series (Hourly 24, Daily 7, Weekly 52, Monthly 12, Quarterly 4, Yearly 1).
mask: tensor (batch_size, output_size), specifies date stamps per series to consider in the loss.

Returns

mase: Mean absolute scaled error.

References

[1] https://robjhyndman.com/papers/mase.pdf
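
A hedged sketch of the scaling described above: out-of-sample absolute errors divided by the insample mean absolute error of the seasonal naive forecast (mase_sketch is illustrative; the M4 definition may differ in details):

import torch as t

def mase_sketch(y, y_hat, y_insample, seasonality, mask=None):
    if mask is None:
        mask = t.ones_like(y_hat)
    # insample mean absolute error of the seasonal naive model, per series
    naive_errors = t.abs(y_insample[:, seasonality:] - y_insample[:, :-seasonality])
    scale = t.mean(naive_errors, dim=1, keepdim=True)
    return t.mean((t.abs(y - y_hat) / scale) * mask)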

{% endraw %} {% raw %}
{% endraw %} {% raw %}

MAELoss[source]

MAELoss(y, y_hat, mask=None)

MAE Loss

Calculates the Mean Absolute Error between y and y_hat. MAE measures the prediction accuracy of a forecasting method by calculating the absolute deviation of the prediction from the true value at a given time and averaging these deviations over the length of the series.

Parameters

y: tensor (batch_size, output_size), actual values in torch tensor.
y_hat: tensor (batch_size, output_size), predicted values in torch tensor.
mask: tensor (batch_size, output_size), specifies date stamps per series to consider in the loss.

Returns

mae: Mean absolute error.
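
A hedged sketch of the masked computation (mae_sketch is illustrative):

import torch as t

def mae_sketch(y, y_hat, mask=None):
    if mask is None:
        mask = t.ones_like(y_hat)
    # absolute deviation at each date stamp, averaged over the series
    return t.mean(t.abs(y - y_hat) * mask)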

{% endraw %} {% raw %}
{% endraw %} {% raw %}

PinballLoss[source]

PinballLoss(y, y_hat, mask=None, tau=0.5)

Pinball Loss

Computes the pinball loss between y and y_hat.

Parameters

y: tensor (batch_size, output_size), actual values in torch tensor.
y_hat: tensor (batch_size, output_size), predicted values in torch tensor.
mask: tensor (batch_size, output_size), specifies date stamps per series to consider in the loss.
tau: float, between 0 and 1, the slope of the pinball loss; in the context of quantile regression, the value of tau determines the conditional quantile level.

Returns

pinball: Average pinball loss for the predicted quantile.
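
A minimal sketch of the asymmetric penalty: under-predictions are weighted by tau and over-predictions by (1 - tau), so tau=0.5 reduces to half the MAE (pinball_sketch is illustrative):

import torch as t

def pinball_sketch(y, y_hat, mask=None, tau=0.5):
    if mask is None:
        mask = t.ones_like(y_hat)
    delta = (y - y_hat) * mask
    return t.mean(t.max(tau * delta, (tau - 1) * delta))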

{% endraw %} {% raw %}
{% endraw %}

ES-RNN PyTorch loss

{% raw %}

LevelVariabilityLoss[source]

LevelVariabilityLoss(levels, level_variability_penalty)

Level Variability Loss

Computes the variability penalty for the level.

Parameters

levels: tensor with shape (batch, n_time), levels obtained from the exponential smoothing component of the ESRNN.
level_variability_penalty: float, controls the strength of the penalization of the wiggliness of the level vector; induces smoothness in the output.

Returns

level_var_loss: Wiggliness loss for the level vector.
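
One common formulation penalizes the squared second differences of the log-levels; a hedged sketch under that assumption (the library's exact formulation may differ):

import torch as t

def level_variability_sketch(levels, level_variability_penalty):
    # second differences of the log-levels measure the wiggliness of the level vector
    log_diff = t.log(levels[:, 1:]) - t.log(levels[:, :-1])
    second_diff = log_diff[:, 1:] - log_diff[:, :-1]
    return level_variability_penalty * t.mean(second_diff ** 2)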

{% endraw %} {% raw %}
{% endraw %} {% raw %}

SmylLoss[source]

SmylLoss(y, y_hat, levels, mask, tau, level_variability_penalty=0.0)

Computes the Smyl loss, which combines the level variability penalty with the pinball loss.

Parameters

y: tensor (n_windows, batch_size, window_size), actual values.
y_hat: tensor (n_windows, batch_size, window_size), predicted values.
levels: tensor with shape (batch, n_time), levels obtained from the exponential smoothing component of the ESRNN.
mask: tensor, specifies date stamps per series to consider in the loss.
tau: float, between 0 and 1, the slope of the pinball loss.
level_variability_penalty: float, controls the strength of the level variability penalty.

Returns

smyl_loss: Pinball loss plus the level variability penalty.
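
A hedged sketch of the combination: the pinball loss on the forecasts plus the level variability penalty on the smoothing levels (smyl_loss_sketch is illustrative, and the level penalty follows the assumed formulation sketched above):

import torch as t

def smyl_loss_sketch(y, y_hat, levels, mask, tau, level_variability_penalty=0.0):
    # pinball loss on the masked forecast errors
    delta = (y - y_hat) * mask
    loss = t.mean(t.max(tau * delta, (tau - 1) * delta))
    if level_variability_penalty > 0:
        # squared second differences of the log-levels (assumed formulation)
        log_diff = t.log(levels[:, 1:]) - t.log(levels[:, :-1])
        second_diff = log_diff[:, 1:] - log_diff[:, :-1]
        loss = loss + level_variability_penalty * t.mean(second_diff ** 2)
    return loss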

{% endraw %} {% raw %}
{% endraw %}

Multi-quantile PyTorch loss

MQLoss definition and testing

{% raw %}

MQLoss[source]

MQLoss(y, y_hat, quantiles, mask=None)

MQLoss

Calculates the average multi-quantile loss for a given set of quantiles, based on the absolute difference between predicted and true values.

Parameters

y: tensor (batch_size, output_size), actual values in torch tensor.
y_hat: tensor (batch_size, output_size, n_quantiles), predicted values in torch tensor.
mask: tensor (batch_size, output_size, n_quantiles), specifies date stamps per series to consider in the loss.
quantiles: tensor (n_quantiles), quantiles to estimate from the distribution of y.

Returns

lq: tensor (n_quantiles), average multi-quantile loss.
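
A minimal sketch of the multi-quantile computation: the pinball loss is evaluated at every quantile simultaneously by broadcasting y against the quantile dimension of y_hat, then averaged (mq_loss_sketch is illustrative and reduces to a scalar; the library's reduction may differ):

import torch as t

def mq_loss_sketch(y, y_hat, quantiles, mask=None):
    if mask is None:
        mask = t.ones_like(y_hat)
    # error has shape (batch_size, output_size, n_quantiles)
    error = y.unsqueeze(-1) - y_hat
    loss = t.max(quantiles * error, (quantiles - 1) * error)
    return t.mean(loss * mask)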

{% endraw %} {% raw %}
{% endraw %} {% raw %}

wMQLoss[source]

wMQLoss(y, y_hat, quantiles, mask=None)

wMQLoss

Calculates the weighted average multi-quantile loss for a given set of quantiles, based on the absolute difference between predicted and true values.

Parameters

y: tensor (batch_size, output_size), actual values in torch tensor.
y_hat: tensor (batch_size, output_size, n_quantiles), predicted values in torch tensor.
mask: tensor (batch_size, output_size, n_quantiles), specifies date stamps per series to consider in the loss.
quantiles: tensor (n_quantiles), quantiles to estimate from the distribution of y.

Returns

lq: tensor (n_quantiles), weighted average multi-quantile loss.
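
A heavily hedged sketch of one plausible weighting, normalizing the summed quantile losses by the summed absolute actual values; the weighting scheme is an assumption here, so check the library source for the exact definition:

import torch as t

def wmq_loss_sketch(y, y_hat, quantiles, mask=None):
    if mask is None:
        mask = t.ones_like(y_hat)
    error = y.unsqueeze(-1) - y_hat
    loss = t.max(quantiles * error, (quantiles - 1) * error)
    # normalizing by |y| is an assumed interpretation of the "w" prefix
    return t.sum(loss * mask) / t.sum(t.abs(y.unsqueeze(-1)) * mask)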

{% endraw %} {% raw %}
{% endraw %}

Checks for PyTorch train losses

{% raw %}
import torch as t
from torch import nn, optim
from torch.utils.data import DataLoader, Dataset

import numpy as np
import time
from scipy.stats import hmean
import matplotlib.pyplot as plt
%matplotlib inline
{% endraw %} {% raw %}
class Model(nn.Module):
    """Linear model mapping n_obs features to horizon * n_quantiles outputs."""

    def __init__(self, horizon, n_quantiles):
        super(Model, self).__init__()
        self.horizon = horizon
        self.n_quantiles = n_quantiles
        # n_obs is a global defined in the data-generation cell below
        self.linear_layer = nn.Linear(in_features=n_obs,
                                      out_features=horizon * n_quantiles,
                                      bias=False)

    def forward(self, x):
        y_hat = self.linear_layer(x)
        # reshape flat predictions to (batch_size, horizon, n_quantiles)
        y_hat = y_hat.view(-1, self.horizon, self.n_quantiles)
        return y_hat

class Data(Dataset):
    """Minimal in-memory dataset wrapping feature and target tensors."""

    # Constructor
    def __init__(self, Y, X):
        self.X = X
        self.Y = Y
        self.len = Y.shape[0]

    # Getter
    def __getitem__(self, index):
        return self.X[index], self.Y[index]

    # Get Length
    def __len__(self):
        return self.len
{% endraw %} {% raw %}
t.manual_seed(7)  # seed the CPU RNG used to sample the data below

# Sample data
n_ts = 1000
n_obs = horizon = 10
mean = 0.0  # mean of N(mean, std) for sampling random data
std = 7.0   # std of N(mean, std) for sampling random data
start = 0.05  # first quantile
end = 0.95    # last quantile
steps = 4     # number of quantiles

# Hyperparameters
batch_size = 500
lr = 0.08
epochs = 100

# Quantiles to estimate
quantiles = t.linspace(start=start, end=end, steps=steps)
print(f'quantiles:\n{quantiles}')
Y = t.normal(mean=mean, std=std, size=(n_ts, n_obs))
X = t.ones(size=(n_ts, n_obs))

Y_test = t.normal(mean=mean, std=std, size=(n_ts, horizon))
X_test = t.ones(size=(n_ts, horizon))
print(f'Y.shape: {Y.shape}, X.shape: {X.shape}')
print(f'Y_test.shape: {Y_test.shape}, X_test.shape: {X_test.shape}')
quantiles:
tensor([0.0500, 0.3500, 0.6500, 0.9500])
Y.shape: torch.Size([1000, 10]), X.shape: torch.Size([1000, 10])
Y_test.shape: torch.Size([1000, 10]), X_test.shape: torch.Size([1000, 10])
{% endraw %} {% raw %}
model = Model(horizon=horizon, n_quantiles=len(quantiles))
dataset = Data(X=X, Y=Y)
dataloader = DataLoader(dataset=dataset, batch_size=batch_size)
optimizer = optim.Adam(model.parameters(), lr=lr)

def train_model(model, epochs, print_progress=False):

    start = time.time()
    i = 0
    training_trajectory = {'epoch': [],
                           'train_loss': []}

    for epoch in range(epochs):
        for x, y in dataloader:

            i += 1
            y_hat = model(x)
            #training_loss = wMQLoss(y=y, y_hat=y_hat, quantiles=quantiles)
            training_loss = MQLoss(y=y, y_hat=y_hat, quantiles=quantiles)
            # record the trajectory at a step interval that grows with the epoch
            if i % (epoch + 1) == 0:
                training_trajectory['epoch'].append(i)
                training_trajectory['train_loss'].append(training_loss.detach().numpy())
            optimizer.zero_grad()
            training_loss.backward()
            optimizer.step()

            if print_progress:
                display_string = 'Step: {}, Time: {:03.3f}, Insample {}: {:.5f}'.format(i,
                                                                                        time.time()-start,
                                                                                        "MQLoss",
                                                                                        training_loss.cpu().data.numpy())
                print(display_string)

    return model, training_trajectory
{% endraw %}
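
A short usage sketch for the check above: train the model and plot the recorded trajectory (assumes the preceding cells have been run in order).

{% raw %}
model, training_trajectory = train_model(model=model, epochs=epochs)

# visualize the convergence of the multi-quantile training loss
plt.plot(training_trajectory['epoch'], training_trajectory['train_loss'])
plt.xlabel('step')
plt.ylabel('MQLoss')
plt.show()
{% endraw %}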