
Mean Absolute Error (MAE) Loss Function in PyTorch


In this tutorial, you’ll learn about the Mean Absolute Error (MAE) or L1 Loss Function in PyTorch for developing your deep-learning models. The MAE loss function is an important criterion for evaluating regression models in PyTorch.

This tutorial provides a comprehensive overview of the L1 loss function in PyTorch. By the end of this tutorial, you’ll have learned the following:

  • What the Mean Absolute Error (L1 loss) is and how it’s calculated
  • How the MAE is implemented in PyTorch
  • When to use the mean absolute error in your deep learning models

If you’re looking for a quick answer, mean absolute error loss can be implemented in PyTorch for developing deep learning models as shown below:

# Implementing Mean Absolute Error Loss in PyTorch
import torch
import torch.nn as nn

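# preds and vals are assumed to be existing tensors of the same shape,
# holding the model's predictions and the target values
criterion = nn.L1Loss()
loss = criterion(preds, vals)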

Understanding the Mean Absolute Error Loss Function

The Mean Absolute Error (MAE) is a straightforward metric quantifying the average absolute difference between predicted values and their corresponding true values. It measures how well your model’s predictions match the actual data.

The MAE is calculated as below:

MAE = (1 / N) * Σ|predicted - actual|

As the name suggests, the loss formula calculates the average value of the absolute differences between the predicted value and the actual value. The main difference between this and the mean squared error is that the MSE squares the difference, while the MAE does not.
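To make the formula concrete, here is a minimal sketch that applies it by hand and compares the result with nn.L1Loss. The sample values are made up for illustration and are not taken from any dataset.

```python
import torch
import torch.nn as nn

# Hypothetical sample values, chosen only for illustration
predicted = torch.tensor([1.0, 2.0, 3.0, 4.0])
actual = torch.tensor([2.0, 2.0, 5.0, 4.0])

# Apply the formula directly: the mean of the absolute differences
# (the absolute differences here are 1, 0, 2, and 0)
manual_mae = (predicted - actual).abs().mean()

# The built-in criterion should agree with the manual calculation
builtin_mae = nn.L1Loss()(predicted, actual)

# For comparison, MSE squares each difference before averaging
manual_mse = ((predicted - actual) ** 2).mean()

print(f'Manual MAE:   {manual_mae.item()}')   # 0.75
print(f'Built-in MAE: {builtin_mae.item()}')  # 0.75
print(f'Manual MSE:   {manual_mse.item()}')   # 1.25
```

The single error of 2 contributes 2 to the MAE sum but 4 to the MSE sum, which is why the MSE comes out larger.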

The mean absolute error provides the following benefits:

  1. Robustness to outliers: In deep learning, you may encounter data points that are very different from the norm. These values can deeply influence loss functions. However, MAE’s absolute difference approach ensures that the impact of each outlier is linear and doesn’t escalate dramatically. This property makes MAE well-suited for datasets containing noisy or extreme values.
  2. Interpretability: MAE’s concept of measuring the average absolute deviation between predicted and actual values aligns well with intuitive human understanding. This is especially true because the loss values keep the same units, which makes the loss even simpler to understand.
  3. Training Stability: The gradient of MAE with respect to each prediction has a constant magnitude regardless of how large the error is, so a single large error doesn’t produce a very large gradient the way it can with Mean Squared Error (MSE). This can contribute to smoother training on noisy data.
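The gradient behavior behind the training stability point can be checked directly with autograd. The sketch below uses two made-up errors, one small and one large, and compares the gradients that nn.L1Loss and nn.MSELoss send back to the predictions.

```python
import torch
import torch.nn as nn

actual = torch.tensor([0.0, 0.0])

# Hypothetical predictions: one small error (1.0) and one large error (100.0)
pred_mae = torch.tensor([1.0, 100.0], requires_grad=True)
pred_mse = torch.tensor([1.0, 100.0], requires_grad=True)

# Backpropagate each loss to get the gradient with respect to the predictions
nn.L1Loss()(pred_mae, actual).backward()
nn.MSELoss()(pred_mse, actual).backward()

# MAE gradient is sign(error) / N, so both entries have the same magnitude
print(pred_mae.grad)  # tensor([0.5000, 0.5000])

# MSE gradient is 2 * error / N, so it grows with the size of the error
print(pred_mse.grad)  # tensor([  1., 100.])
```

With MAE, the large error pulls on the model no harder than the small one. With MSE, the gradient for the large error is 100 times bigger.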

Now that you have a good understanding of how the L1 loss is calculated and what benefits it provides, let’s dive into seeing how the loss function is calculated in PyTorch.

Implementing the Mean Absolute Error in PyTorch

In PyTorch, the MAE loss function is implemented using the nn.L1Loss class. The L1 loss is the same as the mean absolute error. Let’s take a look at how the class can be used. We can create two tensors: one containing sample predicted values and another containing actual values.

# Calculating MAE Loss in PyTorch
import torch
import torch.nn as nn

# Create sample values
predicted = torch.tensor([2.5, 4.8, 6.9, 9.5])
actual = torch.tensor([3.0, 5.0, 7.0, 9.0])

# Create and use criterion
criterion = nn.L1Loss()
loss = criterion(predicted, actual)

print(f'MAE Loss: {loss}')

# Returns: MAE Loss: 0.32499992847442627

In the code block above, we first imported our libraries. We then created two tensors, which represent the predicted and actual values. Next, we defined our loss function by instantiating the nn.L1Loss class. From there, we simply passed in our predicted and actual values to calculate our loss.
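nn.L1Loss also accepts a reduction argument that controls how the per-element absolute errors are combined. The sketch below reuses the sample tensors from the example above to show the three options, along with the functional form F.l1_loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

predicted = torch.tensor([2.5, 4.8, 6.9, 9.5])
actual = torch.tensor([3.0, 5.0, 7.0, 9.0])

# reduction='none' keeps one absolute error per element
per_element = nn.L1Loss(reduction='none')(predicted, actual)

# reduction='sum' adds the absolute errors together
total = nn.L1Loss(reduction='sum')(predicted, actual)

# reduction='mean' is the default and gives the MAE
mean = nn.L1Loss(reduction='mean')(predicted, actual)

# The functional form computes the same value without creating a criterion object
functional = F.l1_loss(predicted, actual)

print(f'Per-element errors: {per_element}')
print(f'Sum of errors: {total.item()}')
print(f'Mean of errors: {mean.item()}')
print(f'Functional form: {functional.item()}')
```

Looking at the per-element errors can be useful when you want to see which predictions contribute most to the loss.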

Now that you know how to implement the mean absolute error, let’s take a look at how to interpret it.

Interpreting Mean Absolute Error in Deep Learning

The mean absolute error is always 0 or positive. When the MAE (or L1 loss) is larger, this is an indication that the regression model doesn’t accurately predict the target values.

A lower MAE indicates that the predictions are, on average, closer to the actual values. Inversely, a higher MAE value indicates that the predicted values are further away from the actual values.

While MAE provides valuable insights, it’s important to note that it weights every error in proportion to its size, so one large error counts the same as several small errors that add up to the same total. This can be both an advantage and a limitation. For instance, it doesn’t emphasize larger errors the way other loss functions like Mean Squared Error (MSE) do. Thus, considering the characteristics of your data and the significance of different error magnitudes is crucial when interpreting MAE.
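One way to see this limitation is to compare two sets of predictions that have the same MAE but distribute their errors very differently. The values below are invented for illustration.

```python
import torch
import torch.nn as nn

actual = torch.zeros(4)

# Hypothetical predictions: four errors of 1 versus a single error of 4
pred_even = torch.tensor([1.0, 1.0, 1.0, 1.0])
pred_spiky = torch.tensor([0.0, 0.0, 0.0, 4.0])

mae = nn.L1Loss()
mse = nn.MSELoss()

mae_even, mae_spiky = mae(pred_even, actual), mae(pred_spiky, actual)
mse_even, mse_spiky = mse(pred_even, actual), mse(pred_spiky, actual)

# MAE cannot tell the two cases apart
print(f'MAE: {mae_even.item()} vs {mae_spiky.item()}')  # 1.0 vs 1.0

# MSE flags the case with one large error
print(f'MSE: {mse_even.item()} vs {mse_spiky.item()}')  # 1.0 vs 4.0
```

If a single large miss is much worse for your application than several small ones, the MAE alone won't surface that difference.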

When To Use Mean Absolute Error in Deep Learning

Choosing a loss function is an important part of developing deep learning models. Different loss functions are more suitable for different datasets. The mean absolute error allows you to better understand the error attributed to a model. This can be an important consideration if you need to be able to explain the results to lay audiences.

Similarly, the MAE is more robust to outliers. Compared to other loss functions, such as the mean squared error, the L1 loss is less influenced by very large errors. This can prevent a few extreme values from skewing your loss. It can be helpful to explore your data before choosing a loss function to see how it might be affected by outliers.
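The sketch below illustrates the effect of a single outlier on both losses. The predictions and targets are hypothetical. The only change between the two cases is that the last target is replaced with an extreme value.

```python
import torch
import torch.nn as nn

# Hypothetical predictions that are each off by 0.5 on clean data
predicted = torch.tensor([3.5, 4.5, 7.5, 8.5])
clean = torch.tensor([3.0, 5.0, 7.0, 9.0])

# Same targets, but the last one is replaced by an outlier
noisy = torch.tensor([3.0, 5.0, 7.0, 109.0])

mae = nn.L1Loss()
mse = nn.MSELoss()

mae_clean, mae_noisy = mae(predicted, clean), mae(predicted, noisy)
mse_clean, mse_noisy = mse(predicted, clean), mse(predicted, noisy)

print(f'MAE: {mae_clean.item()} -> {mae_noisy.item()}')  # 0.5 -> 25.5
print(f'MSE: {mse_clean.item()} -> {mse_noisy.item()}')  # 0.25 -> 2525.25

# How many times larger each loss became because of one outlier
print(f'MAE grew by a factor of {(mae_noisy / mae_clean).item():.0f}')
print(f'MSE grew by a factor of {(mse_noisy / mse_clean).item():.0f}')
```

Both losses increase, but the MSE increases by a far larger factor because the outlier's error is squared.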

While MAE has numerous benefits, there are cases where it might not be the best choice. For instance, if heavily penalizing large errors is a priority, MSE could be more suitable due to its squaring effect. This is because errors larger than 1 grow when squared, while smaller errors (those smaller than 1) become even smaller.
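To see what squaring does to errors of different sizes, the sketch below compares the per-element penalty from each loss for a few made-up error values, using reduction='none' so that no averaging takes place.

```python
import torch
import torch.nn as nn

# Hypothetical error sizes, from small to large
errors = torch.tensor([0.1, 0.5, 1.0, 2.0, 10.0])
target = torch.zeros_like(errors)

# reduction='none' returns the penalty for each element separately
absolute = nn.L1Loss(reduction='none')(errors, target)
squared = nn.MSELoss(reduction='none')(errors, target)

for e, a, s in zip(errors.tolist(), absolute.tolist(), squared.tolist()):
    print(f'error={e:5.1f}  absolute={a:7.2f}  squared={s:7.2f}')
```

Below an error of 1, the squared penalty is smaller than the absolute one. Above 1, it is larger, and the gap widens quickly as the error grows.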


Conclusion

In this tutorial, we’ve explored the Mean Absolute Error (MAE) or L1 Loss Function in PyTorch and its significance in developing deep learning models. By now, you should have a solid grasp of how the MAE loss function operates and when it’s advantageous to use it.

We started by understanding the fundamentals of MAE, uncovering its role in calculating the average absolute difference between predicted and actual values. This intuitive metric provides valuable insights into the model’s performance, enabling you to gauge the accuracy of predictions effectively.

We then dived into the implementation of MAE in PyTorch, showing you how to calculate the loss using the nn.L1Loss class. Through concise code examples, you learned how to evaluate model predictions against ground truth values, yielding a clear measure of the model’s predictive capabilities.

Interpreting the MAE introduced you to the significance of the calculated MAE values. While it treats all errors equally, this property has its own set of advantages and limitations. You’re now equipped to interpret MAE values in the context of your specific problem domain, understanding how they reflect the balance between prediction accuracy and error distribution.

Knowing when to leverage MAE in deep learning is crucial. The loss function’s robustness to outliers, interpretability, balanced error consideration, and training stability make it a valuable choice for various scenarios. It’s particularly effective when transparent error interpretation and resilience to extreme values are required.

To learn more about the L1Loss, check out the official documentation.

Nik Piepenbreier

Nik has over a decade of experience working with data analytics, data science, and Python. He specializes in teaching developers how to use Python for data science using hands-on tutorials.
