Seaborn displot - Distribution Plots in Python • datagy

In this tutorial, you’ll learn how to create Seaborn distribution plots using the sns.displot() function. Distribution plots show how a variable (or multiple variables) is distributed. Seaborn provides several distribution plotting functions, including ones for histograms and kernel density estimates.

Seaborn provides dedicated functions for both of these visualizations. So, why would you want to use the displot() function? The Seaborn displot() function is a figure-level function, rather than an axes-level function. This opens up different possibilities in terms of how you put together your visualizations.

By the end of this tutorial, you’ll have learned the following:

What the Seaborn displot() function is
When to use the Seaborn displot() function instead of the dedicated functions
How to plot multiple plots using the sns.displot() figure-level function
How to customize titles, colors, and more

Understanding the Seaborn displot() Function

The Seaborn displot() function is used to create figure-level distribution plots onto a Seaborn FacetGrid. You can customize the type of visualization that is created by using the kind= parameter.

Because displot() provides a figure-level interface, the function draws onto a whole figure (through a FacetGrid), rather than onto a single axes object. This opens up many more possibilities, such as splitting the data into rows and columns of subplots.

Let’s take a look at how the function is written:

# Understanding the Seaborn displot() Function
import seaborn as sns
sns.displot(data=None, *, x=None, y=None, hue=None, row=None, col=None, weights=None, kind='hist', rug=False, rug_kws=None, log_scale=None, legend=True, palette=None, hue_order=None, hue_norm=None, color=None, col_wrap=None, row_order=None, col_order=None, height=5, aspect=1, facet_kws=None, **kwargs)

The function allows you to plot the following visualization types. The first three are selected with the kind= parameter, while rug plots are added on top of any of them with rug=True:

Histograms with the Seaborn histplot() function (kind='hist', the default)
Kernel density estimate plots with the Seaborn kdeplot() function (kind='kde')
Empirical cumulative distribution function plots with the Seaborn ecdfplot() function (kind='ecdf')
Rug plots with the Seaborn rugplot() function (rug=True)

Some of these visualizations are a little bit more specific and niche. The image below shows what a similar distribution looks like using different plots:

Figure: Different charts created with Seaborn displot

The function has a very similar interface to the other distribution plotting functions. Let’s take a look at some of the key options:

data= provides the data to plot via a Pandas DataFrame
x= and y= provide the variables to plot on the x- and y-axis respectively
hue= adds an additional variable to plot via a color mapping

Additionally, the function offers some extra parameters available only in the displot() function. Let’s explore these:

kind= determines what type of chart to create. By default, it will create a histogram, using the keyword argument 'hist'
row= allows you to split your dataset into additional rows of visualizations
col= allows you to split your dataset into additional columns of visualizations
height= and aspect= control the size of your data visualization

Now that you have a strong understanding of what’s possible, let’s dive into how we can use the function to create useful data visualizations.

Loading a Sample Dataset

To follow along with this tutorial, let’s use a dataset provided by the Seaborn library. We’ll use the popular Tips dataset available through the sns.load_dataset() function.

Let’s see how we can read the dataset and explore its first five rows:

# Exploring the Sample Dataset
import seaborn as sns
df = sns.load_dataset('tips')
print(df.head())

# Returns:
#    total_bill   tip     sex smoker  day    time  size
# 0       16.99  1.01  Female     No  Sun  Dinner     2
# 1       10.34  1.66    Male     No  Sun  Dinner     3
# 2       21.01  3.50    Male     No  Sun  Dinner     3
# 3       23.68  3.31    Male     No  Sun  Dinner     2
# 4       24.59  3.61  Female     No  Sun  Dinner     4

We can see that we have a variety of variables available to us, including some categorical ones as well as some continuous ones.

Creating a Basic displot with Seaborn

By default, the Seaborn displot() function will create a histogram. In order to create the most basic visualization, we can simply pass in the following parameters:

data= to pass in our DataFrame
x= or y= to pass in the column labels that we want to explore in a histogram

Let’s see what this code looks like:

# Seaborn displot() Will Default to Show a Histogram
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')

sns.displot(data=df, x='tip')
plt.show()

In the code block above, we passed in our DataFrame df as well as the 'tip' column label. This returned the following visualization:

Figure: Creating a Histogram with Seaborn displot

We can see that because we’re plotting a single variable along the x-axis, Seaborn returns a histogram. The plot allows us to explore the distribution of the data in that column.

Creating a Kernel Density Estimate Plot with Seaborn displot

While the Seaborn displot() function will default to creating histograms, we can also create KDE plots by passing in kind='kde'. Let’s see what this looks like:

# Use kind= to Modify the Type of Plot Used
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')

sns.displot(data=df, x='tip', kind='kde')
plt.show()

In the code block above, we added one additional keyword argument: kind=. This allowed us to create an entirely different data visualization, as shown below:

Figure: Creating a KDE Plot with Seaborn displot

Because the displot() function will actually use the kdeplot() function under the hood, the behavior is the same. This means that we can use the different keyword arguments that the kdeplot() function provides.

Modifying Seaborn displot with Color

We can add additional detail to our Seaborn graphs by using color. This allows you to encode additional dimensions (or columns of data) in your visualization: while our graphs remain 2-dimensional, color lets us show a further variable.

We can do this using the hue= parameter. Let’s explore how we can add additional levels of detail using color.

Adding Color to Seaborn Displot

To add an additional variable into your Seaborn displot(), you can use the hue= parameter to pass in a DataFrame column that will break the data into multiple colors.

If you’re working with categorical data, Seaborn will assign one color to each unique value in that column.

Adding Color Styles versus Adding Color Dimensions

In this case, we’ll be adding color to represent a different dimension of data. If, instead, you wanted to control the styling of your plot, you could use the palette= parameter. For the remainder of the tutorial, we’ll apply a style to make the default styling a little more aesthetic.

Let’s see how we can use Seaborn to add more detail to our plot using the hue= parameter:

# Add Color Using hue
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')

sns.displot(data=df, x='tip', hue='sex', multiple='stack')
plt.show()

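In the code block above, we passed in hue='sex'. This means that we want to color the bars of our histogram differently based on the values in the 'sex' column. We also passed in multiple='stack', which stacks the bars for each group on top of one another rather than overlaying them. This returns the following image: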

Figure: Adding Additional Variables with Hue in Seaborn displot

We can see that the data visualization now makes the differences between the groups much easier to spot. You can learn more about how to control color in Seaborn histograms by checking out my complete tutorial.

Combining a Histogram with a KDE Plot in Seaborn displot

We can easily combine the histogram that the Seaborn displot() function creates with a kernel density estimate. Rather than creating a separate axes object, the function allows you to pass in kde=True which will draw the estimate on the histogram.

# Combine a Histogram with the KDE
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')

sns.displot(data=df, x='tip', kde=True)
plt.show()

In the code block above, we created a simple histogram with the displot() function but instructed the function to draw the kernel density estimate. This returns the image below:

Figure: Adding a KDE to a Seaborn Histogram with displot

In the following section, you’ll learn how to plot a bivariate distribution using the Seaborn displot function.

Plotting a Bivariate Distribution in Seaborn displot

By default, Seaborn will plot the distribution of a single variable. However, we can plot bivariate distribution plots using the displot function simply by passing column labels into both the x= and y= parameters.

# Drawing Bivariate Distributions
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')

sns.displot(data=df, x='tip', y='total_bill')
plt.show()

In the code block above, we didn’t specify what type of graph we wanted to produce. Because of this, Seaborn defaults to a histogram, which returns a heat map of the distribution along the two variables.

Figure: Plotting a Bivariate Distribution Using Seaborn displot

Similarly, we can plot a bivariate kernel density estimate. In this case, we need to specify this using the kind= parameter.

# Drawing Bivariate Distributions with a KDE
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')

sns.displot(data=df, x='tip', y='total_bill', kind='kde')
plt.show()

In the code block above, we added in kind='kde' which plots a kernel density estimate. When plotting a bivariate distribution, this returns the image below:

Figure: Plotting a Bivariate Distribution Using Seaborn displot in a KDE Plot

Because we’re creating a figure-level object, we can also customize the visualization by adding a rugplot directly. This is what you’ll learn in the following section.

Adding a Rugplot to a Seaborn displot

Rather than needing to call the rugplot function explicitly, Seaborn’s displot allows you to add a rug plot using the rug=True argument. This simplifies the creation of the plot and makes the code significantly cleaner.

# Adding a Rugplot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')

sns.displot(data=df, x='tip', kind='kde', rug=True)
plt.show()

In the code block above, we passed in rug=True, which added a rug plot along the x-axis of the KDE plot, as shown below:

Figure: Adding a Rugplot to a Seaborn displot

In the following section, you’ll learn how to create small multiples of plots in Seaborn.

Creating Subsets of Plots with Rows and Columns

Seaborn provides significant flexibility in creating subsets of plots (or subplots) by spreading data across rows and columns of charts. This allows you to generate “small multiples” of plots.

Rather than splitting a visualization using color or style (though you can do this, too), Seaborn will split the visualization into multiple subplots. However, rather than needing to explicitly define the subplots, Seaborn will plot them onto a FacetGrid for you.

Let’s now explore how we can add columns of data visualizations first.

Adding Columns to Seaborn displot

In order to create columns of subplots, we can use the col= parameter. The parameter accepts either a Pandas DataFrame column label or an array of data. Let’s split our data visualization into columns based on the day of the week:

# Creating Columns of Small Multiples
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')

sns.displot(data=df, x='tip', col='day')
plt.show()

In the code block above, we instructed Seaborn to create columns of small multiples with the 'day' column. This means that Seaborn will create an individual subplot in the broader FacetGrid for each unique value in the 'day' column.

Figure: Adding Columns of Small Multiples in Seaborn displot

But, what happens when we have a lot of unique values? Seaborn will actually keep adding more and more columns.

Because of this, we can wrap the columns using the col_wrap= parameter. The parameter accepts an integer representing how many columns we should have before the charts are wrapped down to another row.

# Adding Column Wrapping for Small Multiples
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')

sns.displot(data=df, x='tip', col='day', col_wrap=3)
plt.show()

This returns the following data visualization, where our small multiples wrap onto a new row after the third column:

Figure: Wrapping Columns in Seaborn displot with Small Multiples

In the following section, you’ll learn how to also add additional rows of visualizations.

Adding Rows to Seaborn displot

Seaborn also allows you to pass in rows of small multiples using the row= parameter. This works in the same way as adding columns. You can also combine the row= parameter with the col= parameter to create rows and columns of small multiples.

Let’s see what this looks like:

# Creating Rows and Columns of Small Multiples
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')

sns.displot(data=df, x='tip', row='sex', col='day')
plt.show()

In the code block above, we passed in row='sex' and col='day' to split the small multiples based on both of these columns. This returns the following data visualization:

Figure: Adding Rows and Columns of Small Multiples Using Seaborn displot

Let’s now take a look at how we can customize the data visualizations by adding titles and axis labels in our charts.

Changing Titles and Axis Labels in Seaborn displot

Adding titles and descriptive axis labels is a great way to make your data visualization more communicative. In many cases, your readers will want to know specifically what a data point and graph represent. Because of this, it’s important to understand how to customize these in Seaborn.

Adding a Title to a Seaborn displot

To add a title to a Seaborn displot(), we can use the fig.suptitle() method available in Matplotlib. In order to do this, we’ll need to first adjust the spacing of our figure object. This process can be a bit heuristic and require some trial and error.

Take a look at the code block below:

# Adding a Title to a displot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')

dist = sns.displot(data=df, x='tip', row='sex', col='day')
dist.fig.subplots_adjust(top=0.92)
dist.fig.suptitle('Comparing Tip Amounts')
plt.show()

In the code block above, we made a number of important changes:

We assigned the displot to a variable, dist
We then adjusted the top margin using fig.subplots_adjust()
Then, we passed in a suptitle() onto the figure object

This returned the following data visualization:

Figure: Adding a Title to a Seaborn displot

Similarly, we can customize the titles of each of the subplots that we create. Let’s take a look at that next.

Adding Titles to Rows and Columns in Seaborn displot

Seaborn provides incredibly flexible formatting options for styling small multiples created with the col= and row= parameters.

# Changing Small Multiple Titles
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')

dist = sns.displot(data=df, x='tip', row='sex', col='day')
dist.fig.subplots_adjust(top=0.92)
dist.fig.suptitle('Comparing Tip Amounts')
dist.set_titles(row_template='Staff Gender: {row_name}', col_template='Day of Week: {col_name}')
plt.show()

In the code block above, we used the .set_titles() method, which is available on FacetGrid objects. The method accepts the row_template= and col_template= parameters, which give you access to the row_name and col_name variables in f-string-like formatting.

This returns the data visualization below:

Figure: Customizing Small Multiples Titles in Seaborn displot

In the following section, you’ll learn how to customize the axis labels in a Seaborn displot.

Changing Axis Labels in Seaborn displot

By default, Seaborn will use the column labels as the axis labels in the visualization. In many cases, however, this isn’t a very descriptive label to use. Because the displot() function returns a FacetGrid object, we can use helper methods to solve this, including:

.set_xlabels() which sets the x-axis label
.set_ylabels() which sets the y-axis label
.set_axis_labels() which sets both the x- and y-axis labels at once

Let’s see what this looks like in Seaborn:

# Adding Axis Labels to Small Multiples
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')

dist = sns.displot(data=df, x='tip', row='sex', col='day')
dist.fig.subplots_adjust(top=0.92)
dist.fig.suptitle('Comparing Tip Amounts')
dist.set_titles(row_template='Staff Gender: {row_name}', col_template='Day of Week: {col_name}')

dist.set_xlabels('Tip Amount')
dist.set_ylabels('# of Transactions')

plt.show()

In the code block above, we added two additional lines of code toward the end to customize the axis labels of our data visualization. This returns the following data visualization:

Figure: Modifying Axis Labels in Seaborn displot

In the section below, you’ll learn how to change the size of a Seaborn displot.

Changing the Size of a Seaborn displot

Because the Seaborn displot() function is a figure-level function that builds its own FacetGrid, we can control the size of the figure directly when we create the plot. In order to do this, we can use the two following parameters:

height= which determines the height in inches of each facet
aspect= which determines the aspect ratio, so that the width is height * aspect

Let’s see how we can change the size of a simpler data visualization in Seaborn:

# Changing a Figure Size
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')

sns.displot(data=df, x='tip', kde=True, height=5, aspect=1.6)
plt.show()

In the code block above, we passed in height=5, aspect=1.6. This means that the height of the facet will be 5 inches, while the width will be 8 inches (5 * 1.6). This returns the following data visualization:

Figure: Changing the Figure Size in a Seaborn displot

It’s incredibly simple to modify the size of your visualization. This can be very useful when dealing with data that are spread horizontally or vertically while reducing whitespace.

Conclusion

In this tutorial, you learned how to use the Seaborn displot() function to create figure-level distribution visualizations. The function allows you to easily create distribution plots, including histograms and kernel density estimate plots, while providing a familiar and consistent interface.

You first learned how to create simple figure-level objects, then worked through to more complex examples by adding additional detail using color. From there, you learned how to create small multiples by adding rows and columns of charts. Finally, you learned how to customize the visualizations by modifying titles, axis labels, and the size of the visual.

Nik Piepenbreier