
Seaborn swarmplot: Bee Swarm Plots for Distributions of Categorical Data


The Seaborn swarmplot function allows you to create data visualizations that easily and effectively show the numeric distribution of data over categories. Other plot types, such as box plots and violin plots, can show this too, but they can be intimidating to non-technical audiences.

This is where the swarm plot (or beeswarm plot) can be very helpful! In fact, the swarm plot can be combined with the box and whisker plot or the violin plot to add additional detail. In this guide, you’ll learn how to use the Seaborn swarmplot function to create beeswarm scatter plots to easily graph numerical distributions over categorical variables.

By the end of this tutorial, you’ll have learned the following:

  • How to understand and use the Seaborn sns.swarmplot() function
  • When swarm plots are a good alternative to simple scatter plots, box plots, and violin plots
  • How to customize Seaborn swarm plots to add more detail with color and marker shapes
  • How to customize labels, titles, and more

What are Swarm Plots and When Would You Want to Use Them?

Before diving into creating Seaborn swarm plots, let’s look at what they are and why they are useful. Visualizing numeric distributions over categorical variables can be a challenging task. For example, using techniques like bar plots means you need to aggregate each distribution into a single value, such as its mean. Using visualizations such as scatter plots leads to results that can be hard to understand (just take a look at the top right visual below).

Comparing Swarm Plots to Different Data Visualizations

Swarm plots are similar to strip plots, but they spread the data points out along the categorical axis so that the points don’t overlap. Like strip plots, they show every observation, which lets you see where the data are concentrated. Because the points don’t overlap, it’s also easier to judge how many observations fall at each value.
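To make the difference between the two plot types concrete, here is a minimal sketch that draws a strip plot and a swarm plot of the same data side by side. It assumes a small synthetic DataFrame (random values that only borrow the penguins column names, not real measurements); the variable names are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (assumption: random values, not real penguin measurements)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "island": np.repeat(["Biscoe", "Dream", "Torgersen"], 40),
    "bill_length_mm": rng.normal(loc=44, scale=5, size=120),
})

fig, (ax_strip, ax_swarm) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# Strip plot: points are jittered randomly, so some of them can overlap
sns.stripplot(data=df, x="island", y="bill_length_mm", ax=ax_strip)
ax_strip.set_title("Strip plot")

# Swarm plot: points are shifted along the categorical axis to avoid overlap
sns.swarmplot(data=df, x="island", y="bill_length_mm", ax=ax_swarm)
ax_swarm.set_title("Swarm plot")

fig.tight_layout()
# Call plt.show() or fig.savefig(...) to view the result
```

To try this on the tutorial's data, replacing the synthetic DataFrame with sns.load_dataset('penguins') should be the only change needed.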

Understanding the Seaborn swarmplot Function

The Seaborn swarmplot() function offers many different parameters. These allow you to customize the plots to a significant extent. In fact, there are so many parameters that we’ll focus primarily on the ones you’ll use the most:

seaborn.swarmplot(data=None, *, x=None, y=None, hue=None, order=None, hue_order=None, dodge=False, orient=None, color=None, palette=None, size=5, edgecolor='gray', linewidth=0, hue_norm=None, native_scale=False, formatter=None, legend='auto', warn_thresh=0.05, ax=None, **kwargs)

We can see that there are a large number of parameters. Let’s break down the important parameters of the Seaborn swarmplot() function. We’ll use these parameters throughout the tutorial:

  • data= defines the data that we want to use, such as a Pandas DataFrame
  • x= and y= define the data series or column labels that we want to use for the x-axis and y-axis respectively
  • hue= adds another dimension to the plot by using different colors for the different values of another variable
  • dodge= separates the hue levels within each category into their own side-by-side groups
  • alpha= allows you to modify the transparency of the dots. Note that this isn’t an explicit parameter of the function, but it can be passed in as an additional keyword argument, which Seaborn forwards to Matplotlib’s scatter() method.
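As a quick illustration of the data=, x=, and y= parameters described above, the sketch below draws the same plot twice: once with column labels plus data=, and once by passing the data series directly. The DataFrame is synthetic (random values with penguins-like column names, an assumption made so the example is self-contained).

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (assumption: random values, not real measurements)
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "island": np.repeat(["Biscoe", "Dream", "Torgersen"], 30),
    "bill_length_mm": rng.normal(loc=44, scale=5, size=90),
})

fig, (ax_labels, ax_series) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# Option 1: pass the DataFrame to data= and refer to columns by label
sns.swarmplot(data=df, x="island", y="bill_length_mm", ax=ax_labels)

# Option 2: pass the Series themselves to x= and y=
sns.swarmplot(x=df["island"], y=df["bill_length_mm"], ax=ax_series)

fig.tight_layout()
```

In both cases Seaborn takes the axis labels from the column (or Series) names, so the two panels should look the same.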

Now that you have a good understanding of the important parameters of the Seaborn swarmplot function, let’s dive into creating a plot with the function.

How to Create a Swarm Plot in Seaborn

Because Seaborn uses a common structure for its functions, creating a swarm plot is simple and intuitive. We can simply pass a DataFrame and column labels into the data=, x=, and y= parameters of the sns.swarmplot() function. Let’s see how we can create a simple bee swarm plot in Seaborn:

# How to Create a Seaborn Swarm Plot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')
sns.swarmplot(data=df, x='island', y='bill_length_mm')
plt.show()

In the code block above, we passed our DataFrame, df, into the data= parameter. Similarly, because we’re using a Pandas DataFrame, we can pass in the column labels as strings into the respective x and y-axis parameters. This returns the following image:

Creating a Simple Seaborn Swarm Plot
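The same call can also produce a horizontal swarm plot. The sketch below simply swaps which column goes to x= and which goes to y=; Seaborn infers the orientation from the variable types. It uses synthetic data (random values with penguins-like column names, an assumption for self-containment).

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (assumption: random values, not real measurements)
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "island": np.repeat(["Biscoe", "Dream", "Torgersen"], 30),
    "bill_length_mm": rng.normal(loc=44, scale=5, size=90),
})

fig, ax = plt.subplots(figsize=(7, 4))

# Numeric column on x and categorical column on y gives a horizontal swarm plot
sns.swarmplot(data=df, x="bill_length_mm", y="island", ax=ax)
```

A horizontal layout can be easier to read when the category names are long.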

In the following section, you’ll learn how to add additional variables to swarm plots using additional colors.

How to Add Color for Additional Variables in Seaborn Swarm Plots

In the swarm plot we generated above, we created a plot that contained information on two dimensions. We can add a third variable to the plot by using the hue= parameter. This maps another column of the data to the color of each point.

# How to Add Another Variable with Color to a Seaborn Swarm Plot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')
sns.swarmplot(data=df, x='island', y='bill_length_mm', hue='sex')
plt.show()

In the code above, we added hue='sex', which split the values in that column into different colors. This returns the image below, where the points have been split into different colors.

Adding Another Dimension to Seaborn Swarm Plots with hue
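When using hue=, it can help to control the order in which the hue levels appear. The sketch below uses the hue_order= parameter from the function signature and then reads the labels back from the legend. The data is synthetic (random values with penguins-like column names), so treat it as an illustration rather than a reproduction of the image above.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (assumption: random values, not real measurements)
rng = np.random.default_rng(3)
df = pd.DataFrame({
    "island": np.repeat(["Biscoe", "Dream", "Torgersen"], 30),
    "sex": np.tile(["Female", "Male"], 45),
    "bill_length_mm": rng.normal(loc=44, scale=5, size=90),
})

fig, ax = plt.subplots(figsize=(7, 4))

# hue= colors each point by the 'sex' column; hue_order= fixes the level order
sns.swarmplot(
    data=df, x="island", y="bill_length_mm",
    hue="sex", hue_order=["Male", "Female"], ax=ax,
)

# Seaborn adds a legend automatically when hue= is used
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```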

We can see that the points are now colored by sex. However, because the colors are mixed together within each category, it can be difficult to see where each group is clustered. You’ll learn how to address this in the following section.

How to Dodge Different Colors in Seaborn Swarm Plots

By default, the points for every hue level in a swarm plot share a single swarm within each category. This can make it difficult to see how each group is distributed. To address this, we can separate the hue levels into their own side-by-side swarms, which makes it easier to see how many data points fall into each subcategory. To do this, we set the Boolean dodge= parameter to True.

# How to Dodge Colors in a Seaborn Swarm Plot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')
sns.swarmplot(data=df, x='island', y='bill_length_mm', hue='sex', dodge=True)
plt.show()

In the code block above, we set dodge=True so that the points for each hue level are drawn in their own swarm within each category. This returned the image below:

Dodging Categories of Colors in Seaborn Swarm Plots
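To see the effect of dodge= directly, the following sketch draws the same data with dodge=False and dodge=True side by side. It assumes synthetic data (random values with penguins-like column names); the axes names are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (assumption: random values, not real measurements)
rng = np.random.default_rng(4)
df = pd.DataFrame({
    "island": np.repeat(["Biscoe", "Dream", "Torgersen"], 30),
    "sex": np.tile(["Female", "Male"], 45),
    "bill_length_mm": rng.normal(loc=44, scale=5, size=90),
})

fig, (ax_mixed, ax_dodged) = plt.subplots(1, 2, figsize=(11, 4), sharey=True)

# dodge=False (the default): both hue levels share one swarm per category
sns.swarmplot(data=df, x="island", y="bill_length_mm",
              hue="sex", dodge=False, ax=ax_mixed)
ax_mixed.set_title("dodge=False")

# dodge=True: each hue level gets its own swarm within the category
sns.swarmplot(data=df, x="island", y="bill_length_mm",
              hue="sex", dodge=True, ax=ax_dodged)
ax_dodged.set_title("dodge=True")

fig.tight_layout()
```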

In the following section, you’ll learn how to modify the transparency of the points in a swarm plot.

How to Modify the Transparency of Seaborn Swarm Plots

Swarm plots try to avoid overlapping points, but with many observations the points can still crowd together or pile up at the edges of a swarm. When that happens, it’s hard to see how many points are actually clustered in a given position. Making the individual points partly transparent helps reveal that clustering.

Let’s see how we can use the alpha= parameter to modify the transparency. The argument accepts a float between 0.0 and 1.0, where lower values are more transparent.

# How to Change the Transparency of a Seaborn Swarm Plot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')
sns.set_palette('Set2')
sns.swarmplot(data=df, x='island', y='bill_length_mm', alpha=0.35)
plt.show()

This returns the image below, where the individual points are now semi-transparent.

Modifying Transparency in Seaborn Swarm Plots
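Transparency is not the only way to deal with crowded swarms. As a hedged sketch, the example below also shrinks the markers using the size= parameter from the function signature (default 5), which gives the points more room. The dataset is synthetic and deliberately large per category (an assumption chosen to crowd the swarm), so Seaborn may warn that some points could not be placed in the left panel.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data with many points per category (assumption, not real data)
rng = np.random.default_rng(5)
df = pd.DataFrame({
    "island": np.repeat(["Biscoe", "Dream", "Torgersen"], 150),
    "bill_length_mm": rng.normal(loc=44, scale=5, size=450),
})

fig, (ax_default, ax_small) = plt.subplots(1, 2, figsize=(11, 4), sharey=True)

# Default marker size (size=5)
sns.swarmplot(data=df, x="island", y="bill_length_mm", ax=ax_default)
ax_default.set_title("size=5 (default)")

# Smaller, semi-transparent markers leave more room for crowded swarms
sns.swarmplot(data=df, x="island", y="bill_length_mm",
              size=3, alpha=0.5, ax=ax_small)
ax_small.set_title("size=3, alpha=0.5")

fig.tight_layout()
```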

In the following section, you’ll learn how to use swarm plots creatively by adding them to violin plots.

How to Add Swarm Plots to Seaborn Violin Plots

We can combine a swarm plot with a Seaborn violin plot to help readers better understand the violin plot. Seaborn makes this simple: we call both functions one after the other, and Seaborn draws both plots on the same axes object. Let’s see what this looks like:

# How to Overlay a Swarm Plot over a Violin Plot in Seaborn
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')
# Draw the violins first, then the swarm on top of them
sns.violinplot(data=df, x='island', y='bill_length_mm')
sns.swarmplot(data=df, x='island', y='bill_length_mm', alpha=0.25, color='black')
plt.show()

By calling both the violin plot and swarm plot functions, Seaborn maps both of these to the same visualization. This returns the data visualization below:

Adding Seaborn Swarm Plots to Violin Plots

We can see that this allows users to better understand the underlying distribution of the Seaborn violin plot.
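The same layering approach works with box plots, as mentioned in the introduction. The sketch below is one possible way to do it: it draws a box plot first and a swarm plot on top, passing both to the same axes via ax=. The data is synthetic (random values with penguins-like column names), and hiding the box plot's outlier markers with showfliers=False is a design choice made here so that outliers aren't drawn twice.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (assumption: random values, not real measurements)
rng = np.random.default_rng(6)
df = pd.DataFrame({
    "island": np.repeat(["Biscoe", "Dream", "Torgersen"], 40),
    "bill_length_mm": rng.normal(loc=44, scale=5, size=120),
})

fig, ax = plt.subplots(figsize=(7, 4))

# Box plot underneath; outlier markers hidden because the swarm shows every point
sns.boxplot(data=df, x="island", y="bill_length_mm",
            color="white", showfliers=False, ax=ax)

# Swarm plot on top, sharing the same axes
sns.swarmplot(data=df, x="island", y="bill_length_mm",
              color="black", alpha=0.5, size=4, ax=ax)
```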

How to Change the Palette of a Seaborn Swarm Plot

Seaborn makes changing the color palette of your plot very simple. One of the ways in which you can do this is by calling the sns.set_palette() function and passing in a named palette. Let’s see how we can use the 'Set2' palette, which returns a more muted, pastel color scheme.

# How to Change the Palette of a Seaborn Swarm Plot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')
sns.set_palette('Set2')
sns.swarmplot(data=df, x='island', y='bill_length_mm', hue='sex', alpha=0.35)
plt.show()

With the 'Set2' palette set, Seaborn creates the following image of our swarm plot.

Changing the Palette in a Seaborn Swarm Plot
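sns.set_palette() changes the default palette for every plot that follows. If you only want to change the colors of a single plot, one alternative is the palette= parameter of the function itself, used together with hue=. The sketch below assumes synthetic data (random values with penguins-like column names) and checks that the global palette is left untouched.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (assumption: random values, not real measurements)
rng = np.random.default_rng(7)
df = pd.DataFrame({
    "island": np.repeat(["Biscoe", "Dream", "Torgersen"], 30),
    "sex": np.tile(["Female", "Male"], 45),
    "bill_length_mm": rng.normal(loc=44, scale=5, size=90),
})

# Remember the current default palette so we can compare afterwards
palette_before = list(sns.color_palette())

fig, ax = plt.subplots(figsize=(7, 4))

# palette= applies 'Set2' to this plot only; hue= decides what the colors encode
sns.swarmplot(data=df, x="island", y="bill_length_mm",
              hue="sex", palette="Set2", alpha=0.6, ax=ax)

palette_after = list(sns.color_palette())
```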

In the following section, you’ll learn how to add titles and axis labels to your Seaborn beeswarm plot.

How to Add Titles and Axis Labels to Seaborn Swarm Plots

By default, Seaborn doesn’t include a title but will add column labels as the x- and y-axis labels. We can modify this by using the following axes methods:

  • ax.set_title() to set the title,
  • ax.set_xlabel() to set the x-axis label, and
  • ax.set_ylabel() to set the y-axis label

These methods accept many styling options that let you control how your titles and axis labels look. Let’s take a look at what this looks like:

# How to Add a Title and Axis Labels to a Seaborn Swarm Plot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')
sns.set_palette('Set2')
chart = sns.swarmplot(data=df, x='island', y='bill_length_mm', hue='sex', alpha=0.35)
chart.set_title('Bill Length by Island', fontdict={'size':18})
# The x-axis holds the islands and the y-axis holds the bill lengths
chart.set_xlabel('Island', fontdict={'weight':'bold'})
chart.set_ylabel('Bill Length (mm)', fontdict={'weight':'bold'})
plt.show()

We can see that we were able to both add the title and labels and style them using the fontdict= parameter. The parameter accepts a dictionary of Matplotlib text properties, such as size and weight.

Adding Titles and Axis Labels to Seaborn Swarm Plots

We can see that this adds a lot more detail to the plots in Seaborn, making it much clearer to the reader what the plot is displaying.
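If you prefer to create the axes yourself, a compact alternative is to pass them in through the ax= parameter and set the title and both labels in one call with Matplotlib's ax.set(). This is a minimal sketch using synthetic data (random values with penguins-like column names, an assumption for self-containment).

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (assumption: random values, not real measurements)
rng = np.random.default_rng(8)
df = pd.DataFrame({
    "island": np.repeat(["Biscoe", "Dream", "Torgersen"], 30),
    "bill_length_mm": rng.normal(loc=44, scale=5, size=90),
})

# Create the figure and axes explicitly, then hand the axes to Seaborn
fig, ax = plt.subplots(figsize=(7, 4))
sns.swarmplot(data=df, x="island", y="bill_length_mm", ax=ax)

# Set the title and both axis labels in a single call
ax.set(
    title="Bill Length by Island",
    xlabel="Island",
    ylabel="Bill Length (mm)",
)
```

Creating the axes up front also makes it straightforward to control the figure size or to place several plots in one figure.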

Conclusion

In this tutorial, you learned how to create swarm plots in Seaborn using the sns.swarmplot() function. Swarm plots, or beeswarm plots, are scatter plots that display continuous data over categorical data. They build on strip plots by spreading the data points out so that they don’t overlap.

You first learned how the parameters of the function work and what the most important parameters are. Then, you learned how to create a simple swarm plot and enhance it by modifying the colors used. Next, you learned to make the graph clearer by adding separation between colors. From there, you overlaid the swarm plot on a violin plot to make it easier to understand for non-technical audiences. Finally, you learned how to customize the plots by modifying the color palette and by adding titles and axis labels to the plot.
