In this tutorial, you’ll learn how to create Seaborn distribution plots using the sns.displot() function. Distribution plots show how a variable (or multiple variables) is distributed. Seaborn provides many different distribution data visualization functions that include creating histograms or kernel density estimates.
Seaborn provides dedicated functions for both of these visualizations. So, why would you want to use the displot() function? The Seaborn displot() function is a figure-level function, rather than an axes-level function. This opens up different possibilities in terms of how you put together your visualizations.
By the end of this tutorial, you’ll have learned the following:
- What the Seaborn displot() function is
- When to use the Seaborn displot() function instead of the dedicated functions
- How to plot multiple plots using the sns.displot() figure-level function
- How to customize titles, colors, and more
Understanding the Seaborn displot() Function
The Seaborn displot() function is used to create figure-level distribution plots onto a Seaborn FacetGrid. You can customize the type of visualization that is created by using the kind= parameter.
The Seaborn displot() function provides a figure-level interface for creating distribution plots. This means that the function draws onto its own figure, rather than onto a single axes object. This opens up many more possibilities.
Let’s take a look at how the function is written:
# Understanding the Seaborn displot() Function
import seaborn as sns
sns.displot(data=None, *, x=None, y=None, hue=None, row=None, col=None, weights=None, kind='hist', rug=False, rug_kws=None, log_scale=None, legend=True, palette=None, hue_order=None, hue_norm=None, color=None, col_wrap=None, row_order=None, col_order=None, height=5, aspect=1, facet_kws=None, **kwargs)
The function allows you to plot the following visualization types, modified by the kind= parameter:
- Histograms with the Seaborn histplot() function (kind='hist', the default)
- Kernel density estimate plots with the Seaborn kdeplot() function (kind='kde')
- Empirical cumulative distribution function plots with the Seaborn ecdfplot() function (kind='ecdf')
In addition, you can add a rug plot to any of these by passing in rug=True, which uses the Seaborn rugplot() function.
Some of these visualizations are a little bit more specific and niche. The image below shows what a similar distribution looks like using different plots:
The function has a very similar interface to the other distribution plotting functions. Let’s take a look at some of the key options:
- data= provides the data to plot via a Pandas DataFrame
- x= and y= provide the variables to plot on the x- and y-axis respectively
- hue= adds an additional variable to plot via a color mapping
Additionally, the function offers some extra parameters that the axes-level distribution functions don’t provide. Let’s explore these:
- kind= determines what type of chart to create. By default, it will create a histogram, using the keyword argument 'hist'
- row= allows you to split your dataset into additional rows of visualizations
- col= allows you to split your dataset into additional columns of visualizations
- height= and aspect= control the size of your data visualization
Now that you have a strong understanding of what’s possible, let’s dive into how we can use the function to create useful data visualizations.
Loading a Sample Dataset
To follow along with this tutorial, let’s use a dataset provided by the Seaborn library. We’ll use the popular Tips dataset available through the sns.load_dataset() function.
Let’s see how we can read the dataset and explore its first five rows:
# Exploring the Sample Dataset
import seaborn as sns
df = sns.load_dataset('tips')
print(df.head())
# Returns:
# total_bill tip sex smoker day time size
# 0 16.99 1.01 Female No Sun Dinner 2
# 1 10.34 1.66 Male No Sun Dinner 3
# 2 21.01 3.50 Male No Sun Dinner 3
# 3 23.68 3.31 Male No Sun Dinner 2
# 4 24.59 3.61 Female No Sun Dinner 4
We can see that we have a variety of variables available to us, including some categorical ones as well as some continuous ones.
Creating a Basic displot with Seaborn
By default, the Seaborn displot() function will create a histogram. In order to create the most basic visualization, we can simply pass in the following parameters:
- data= to pass in our DataFrame
- x= or y= to pass in the column labels that we want to explore in a histogram
Let’s see what this code looks like:
# Seaborn displot() Will Default to Show a Histogram
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')
sns.displot(data=df, x='tip')
plt.show()
In the code block above, we passed in our DataFrame df as well as the 'tip' column label. This returned the following visualization:
Because we’re plotting a single variable along the x-axis, Seaborn returns a histogram. The plot allows us to explore the distribution of the data in that column.
Creating a Kernel Density Estimate Plot with Seaborn displot
While the Seaborn displot() function will default to creating histograms, we can also create KDE plots by passing in kind='kde'. Let’s see what this looks like:
# Use kind= to Modify the Type of Plot Used
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')
sns.displot(data=df, x='tip', kind='kde')
plt.show()
In the code block above, we added one additional keyword argument: kind=. This allowed us to create an entirely different data visualization, as shown below:
Because the displot() function will actually use the kdeplot() function under the hood, the behavior is the same. This means that we can use the different keyword arguments that the kdeplot() function provides.
Modifying Seaborn displot with Color
We can add additional detail to our Seaborn graphs by using color. While our graphs remain 2-dimensional, color lets us encode an additional dimension (or column of data) in the visualization.
We can do this with the hue= parameter, which maps an additional variable to color. Let’s explore how we can add additional levels of detail using color.
Adding Color to Seaborn Displot
To add an additional variable into your Seaborn displot(), you can use the hue= parameter to pass in a DataFrame column that will break the data into multiple colors.
If you’re working with categorical data, Seaborn will use one color for each unique value in that column.
Adding Color Styles versus Adding Color Dimensions
In this case, we’ll be adding color to represent a different dimension of data. If, instead, you wanted to control the styling of your plot, you could use the palette= parameter. For the remainder of the tutorial, we’ll apply a style to make the default styling a little more aesthetic.
Let’s see how we can use Seaborn to add more detail to our plot using the hue= parameter:
# Add Color Using hue
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')
sns.displot(data=df, x='tip', hue='sex', multiple='stack')
plt.show()
In the code block above, we passed in hue='sex'. This means that we want to color the bars of our histogram differently based on the values in the 'sex' column. We also passed in multiple='stack', which stacks the bars for each group on top of one another rather than layering them. This returns the following image:
We can see that the data visualization now shows more detail, which makes differences between the groups easier to see. You can learn more about how to control color in Seaborn histograms by checking out my complete tutorial.
Combining a Histogram with a KDE Plot in Seaborn displot
We can easily combine the histogram that the Seaborn displot() function creates with a kernel density estimate. Rather than creating a separate axes object, the function allows you to pass in kde=True, which will draw the estimate on the histogram.
# Combine a Histogram with the KDE
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')
sns.displot(data=df, x='tip', kde=True)
plt.show()
In the code block above, we created a simple histogram with the displot() function but instructed the function to draw the kernel density estimate. This returns the image below:
In the following section, you’ll learn how to plot a bivariate distribution using the Seaborn displot function.
Plotting a Bivariate Distribution in Seaborn displot
By default, Seaborn will plot the distribution of a single variable. However, we can plot bivariate distribution plots using the displot function simply by passing column labels into both the x= and y= parameters.
# Drawing Bivariate Distributions
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')
sns.displot(data=df, x='tip', y='total_bill')
plt.show()
In the code block above, we didn’t specify what type of graph we wanted to produce. Because of this, Seaborn defaults to a histogram, which returns a heat map of the distribution along the two variables.
Similarly, we can plot a bivariate kernel density estimate. In this case, we need to specify this using the kind= parameter.
# Drawing Bivariate Distributions with a KDE
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')
sns.displot(data=df, x='tip', y='total_bill', kind='kde')
plt.show()
In the code block above, we added in kind='kde', which plots a kernel density estimate. When plotting a bivariate distribution, this returns the image below:
Because we’re creating a figure-level object, we can also customize the visualization by adding a rugplot directly. This is what you’ll learn in the following section.
Adding a Rugplot to a Seaborn displot
Rather than needing to call the rugplot function explicitly, Seaborn’s displot allows you to add a rug plot using the rug=True argument. This simplifies the creation of the plot and makes the code significantly cleaner.
# Adding a Rugplot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')
sns.displot(data=df, x='tip', kind='kde', rug=True)
plt.show()
In the code block above, we passed in rug=True, which plotted a rug plot along the x-axis beneath the kernel density estimate, as shown below:
In the following section, you’ll learn how to create small multiples of plots in Seaborn.
Creating Subsets of Plots with Rows and Columns
Seaborn provides significant flexibility in creating subsets of plots (or, subplots) by spreading data across rows and columns of data. This allows you to generate “small-multiples” of plots.
Rather than splitting a visualization using color or style (though you can do this, too), Seaborn will split the visualization into multiple subplots. However, rather than needing to explicitly define the subplots, Seaborn will plot them onto a figure FacetGrid for you.
Let’s now explore how we can add columns of data visualizations first.
Adding Columns to Seaborn displot
In order to create columns of subplots, we can use the col= parameter. The parameter accepts either a Pandas DataFrame column label or an array of data. Let’s split our data visualization into columns based on the day of the week:
# Creating Columns of Small Multiples
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')
sns.displot(data=df, x='tip', col='day')
plt.show()
In the code block above, we instructed Seaborn to create columns of small multiples with the 'day' column. This means that Seaborn will create an individual subplot in the broader FacetGrid for each unique value in the 'day' column.
But, what happens when we have a lot of unique values? Seaborn will actually keep adding more and more columns.
Because of this, we can wrap the columns using the col_wrap= parameter. The parameter accepts an integer representing how many columns we should have before the charts are wrapped down to another row.
# Adding Column Wrapping for Small Multiples
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')
sns.displot(data=df, x='tip', col='day', col_wrap=3)
plt.show()
This returns the following data visualization, where our small multiples have been wrapped onto a new row after the third column:
In the following section, you’ll learn how to also add additional rows of visualizations.
Adding Rows to Seaborn displot
Seaborn also allows you to create rows of small multiples. This works in the same way as adding columns. You can also combine the row= parameter with the col= parameter to create rows and columns of small multiples.
Let’s see what this looks like:
# Creating Rows and Columns of Small Multiples
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')
sns.displot(data=df, x='tip', row='sex', col='day')
plt.show()
In the code block above, we passed in row='sex' and col='day' to split the small multiples based on both of these columns. This returns the following data visualization:
Let’s now take a look at how we can customize the data visualizations by adding titles and axis labels in our charts.
Changing Titles and Axis Labels in Seaborn displot
Adding titles and descriptive axis labels is a great way to make your data visualization more communicative. In many cases, your readers will want to know specifically what a data point and graph represent. Because of this, it’s important to understand how to customize these in Seaborn.
Adding a Title to a Seaborn displot
To add a title to a Seaborn displot(), we can use the fig.suptitle() method available in Matplotlib. In order to do this, we’ll need to first adjust the spacing of our figure object. This process can be a bit heuristic and require some trial and error.
Take a look at the code block below:
# Adding a Title to a displot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')
dist = sns.displot(data=df, x='tip', row='sex', col='day')
dist.fig.subplots_adjust(top=0.92)
dist.fig.suptitle('Comparing Tip Amounts')
plt.show()
In the code block above, we made a number of important changes:
- We assigned the displot to a variable, dist
- We then adjusted the top margin using fig.subplots_adjust()
- Then, we added a title to the figure object using suptitle()
This returned the following data visualization:
Similarly, we can customize the titles of each of the subplots that we create. Let’s take a look at that next.
Adding Titles to Rows and Columns in Seaborn displot
Seaborn provides incredibly flexible formatting options for styling small multiples created with the col= and row= parameters.
# Changing Small Multiple Titles
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')
dist = sns.displot(data=df, x='tip', row='sex', col='day')
dist.fig.subplots_adjust(top=0.92)
dist.fig.suptitle('Comparing Tip Amounts')
dist.set_titles(row_template='Staff Gender: {row_name}', col_template='Day of Week: {col_name}')
plt.show()
In the code block above, we used the .set_titles() method, which is available to FacetGrid objects. The method accepts the row_template= and col_template= parameters, which give you access to the row_name and col_name variables using f-string-like placeholder formatting.
This returns the data visualization below:
In the following section, you’ll learn how to customize the axis labels in a Seaborn displot.
Changing Axis Labels in Seaborn displot
By default, Seaborn will use the column labels as the axis labels in the visualization. In many cases, however, these aren’t very descriptive labels to use. Because the displot() function returns a FacetGrid object, we can use helper methods to solve this, including:
- .set_xlabels() which sets the x-axis label
- .set_ylabels() which sets the y-axis label
- .set_axis_labels() which sets both the x- and y-axis labels at once
Let’s see what this looks like in Seaborn:
# Adding Axis Labels to Small Multiples
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')
dist = sns.displot(data=df, x='tip', row='sex', col='day')
dist.fig.subplots_adjust(top=0.92)
dist.fig.suptitle('Comparing Tip Amounts')
dist.set_titles(row_template='Staff Gender: {row_name}', col_template='Day of Week: {col_name}')
dist.set_xlabels('Tip Amount')
dist.set_ylabels('# of Transactions')
plt.show()
In the code block above, we added two additional lines of code toward the end to customize the axis labels of our data visualization. This returns the following data visualization:
In the section below, you’ll learn how to change the size of a Seaborn displot.
Changing the Size of a Seaborn displot
Because the Seaborn displot() function draws onto a FacetGrid, we can easily modify the size of each facet in the figure. In order to do this, we can use the following two parameters:
- height= which determines the height in inches of each facet
- aspect= which determines the aspect ratio, so that the width is height * aspect
Let’s see how we can change the size of a simpler data visualization in Seaborn:
# Changing a Figure Size
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('tips')
sns.displot(data=df, x='tip', kde=True, height=5, aspect=1.6)
plt.show()
In the code block above, we passed in height=5, aspect=1.6. This means that the height of the facet will be 5 inches, while the width will be 8 inches (5 * 1.6). This returns the following data visualization:
It’s incredibly simple to modify the size of your visualization. This can be very useful when dealing with data that are spread horizontally or vertically while reducing whitespace.
Conclusion
In this tutorial, you learned how to use the Seaborn displot() function to create figure-level distribution visualizations. The function allows you to easily create distribution plots, including histograms and kernel density estimate plots, while providing a familiar and consistent interface.
You first learned how to create simple figure-level objects, then worked through to more complex examples by adding additional detail using color. From there, you learned how to create small multiples by adding rows and columns of charts. Finally, you learned how to customize the visualizations by modifying titles, axis labels, and the size of the visual.
Additional Resources
To learn more about related topics, check out the resources below: