Seaborn Scatter Plots in Python: Complete Guide

In this complete guide to using Seaborn to create scatter plots in Python, you’ll learn all you need to know to create scatterplots in Seaborn! Scatterplots are an essential type of data visualization for exploring your data. Being able to effectively create and customize scatter plots in Python will make your data analysis workflow much easier!

By the end of this tutorial, you’ll have learned:

  • How to create scatter plots in Python with Seaborn
  • How to customize colors, markers, and sizes in Seaborn scatter plots
  • How to create 3D scatter plots and add regression lines to scatter plots
  • How to add titles and axis labels to your scatter plots

Understanding the Seaborn scatterplot Function

Before diving into how to create and customize scatterplots in Seaborn, it’s important to understand the scatterplot() function. This allows you to better understand how to use the function and what is possible with it. Let’s take a look at how the function can be used:

# Understanding the Seaborn scatterplot() Function
import seaborn as sns

sns.scatterplot(data=None, x=None, y=None, hue=None, size=None, style=None, palette=None, hue_order=None, hue_norm=None, sizes=None, size_order=None, size_norm=None, markers=True, style_order=None, legend='auto', ax=None)

We can see that the function offers a ton of different parameters.

By making good use of these parameters, we can create incredibly useful visualizations, such as the one shown below:

# What you'll be able to do at the end of this tutorial
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')

sns.set_style(style='whitegrid')

sns.scatterplot(
    data=df, 
    x='bill_length_mm', 
    y='bill_depth_mm', 
    hue='species',
    style='sex',
    palette='Paired_r'
    )

plt.title('Exploring Physical Attributes of Different Penguins')
plt.xlabel('Bill Length (mm)')
plt.ylabel('Bill Depth (mm)')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0)

plt.show()

Which returns the following image:

Final scatterplot in Seaborn
What you’ll learn throughout this tutorial

Let’s explore these parameters to better understand their behavior, including any default arguments that are passed in. The table below breaks down the parameters available in the sns.scatterplot() function:

| Parameter | Description | Default Value | Accepted Values |
| --- | --- | --- | --- |
| data= | The data structure to use, such as a Pandas DataFrame | None | Pandas DataFrame or NumPy array |
| x= | The variable that specifies values on the x axis | None | Vectors or keys in data |
| y= | The variable that specifies values on the y axis | None | Vectors or keys in data |
| hue= | A grouping variable that produces points of different colors (either categorical or numeric) | None | Vectors or keys in data |
| size= | A grouping variable that produces points of different sizes (either categorical or numeric) | None | Vectors or keys in data |
| style= | A grouping variable that produces points with different markers (always treated as categorical, even if the values are numeric) | None | Vectors or keys in data |
| palette= | The method for choosing the colors to use when mapping the hue variable | None | string, list, dict or Matplotlib colormap |
| hue_order= | The order of processing and plotting for categorical levels of the hue semantic | None | list of strings |
| hue_norm= | Either a pair of values that set the normalization range in data units or an object that will map to the [0, 1] range | None | tuple or Matplotlib Normalize object |
| sizes= | An object that determines how sizes are chosen | None | list, dict or tuple |
| size_order= | The order of appearance of the size variable levels | None | list |
| size_norm= | Normalization in data units for scaling plot objects when the size variable is numeric | None | tuple or Matplotlib Normalize object |
| markers= | An object determining how to draw markers for different levels of the style variable | True | boolean, list or dictionary |
| style_order= | The order of appearance of the style variable levels | None | list |
| legend= | How to draw the legend | 'auto' | 'auto', 'brief', 'full', or False |
| ax= | The pre-existing axes for the plot | None | Matplotlib Axes |
The parameters of the Seaborn scatterplot() function

Seaborn scatterplot() can also be called using sns.relplot()

The sns.scatterplot() function is an axes-level function, meaning that it draws onto a single Matplotlib Axes. If you need a figure-level interface instead, you can use the sns.relplot() function with kind='scatter' (which is its default). This gives you more flexibility in drawing small multiples and controlling figure aesthetics.
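As a rough illustration of the difference, the sketch below calls sns.relplot() and splits the plot into one panel per species. To keep it runnable without downloading anything, it uses a small, made-up DataFrame that only mimics the penguins column names; the values are assumptions for illustration, not real measurements.

```python
# A minimal sketch of the figure-level relplot() function
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up rows that mimic the penguins columns (illustrative values only)
df = pd.DataFrame({
    'species': ['Adelie', 'Adelie', 'Chinstrap', 'Chinstrap', 'Gentoo', 'Gentoo'],
    'bill_length_mm': [39.0, 40.5, 46.5, 50.0, 46.0, 49.5],
    'bill_depth_mm': [18.5, 18.0, 18.0, 19.5, 13.5, 16.0],
})

# relplot() returns a FacetGrid (a whole figure) rather than a single Axes
g = sns.relplot(
    data=df,
    x='bill_length_mm',
    y='bill_depth_mm',
    col='species',      # one panel per species
    kind='scatter'
)
plt.show()
```

Because the function manages the whole figure, parameters such as col= and row= are available here but not in sns.scatterplot().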

How to Create Python Seaborn Scatter Plots

In this section, you’ll learn how to create Seaborn scatterplots using the scatterplot() function. For this tutorial, we’ll use a dataset that gives us enough flexibility to try out many of the different features available in the function. We can use the 'penguins' dataset found in Seaborn to try this out.

Let’s begin by loading the library and the dataset and then creating our first scatterplot:

# Loading the Penguins Dataset
import seaborn as sns
df = sns.load_dataset('penguins')
df.info()

# Returns:
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 344 entries, 0 to 343
# Data columns (total 7 columns):
#  #   Column             Non-Null Count  Dtype  
# ---  ------             --------------  -----  
#  0   species            344 non-null    object 
#  1   island             344 non-null    object 
#  2   bill_length_mm     342 non-null    float64
#  3   bill_depth_mm      342 non-null    float64
#  4   flipper_length_mm  342 non-null    float64
#  5   body_mass_g        342 non-null    float64
#  6   sex                333 non-null    object 
# dtypes: float64(4), object(3)
# memory usage: 18.9+ KB

We can see that the dataset comes with a number of different categorical and numerical columns, allowing us to try out a number of different, useful features.

Let’s now use the scatterplot() function to plot bill length and depth against one another:

# Creating Our First Scatterplot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')

sns.scatterplot(data=df, x='bill_length_mm', y='bill_depth_mm')
plt.show()

This returns the following image:

Creating a Simple Seaborn Scatterplot

By passing a Pandas DataFrame into the data= parameter, we were able to reference the columns of that DataFrame as strings. This method is declarative and allows us to abstract away from the complexity of working with Series data.

We can create the same scatterplot by writing:

# Alternative Method of Creating a Scatterplot in Seaborn
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')

sns.scatterplot(x=df['bill_length_mm'], y=df['bill_depth_mm'])
plt.show()

This code generates the same scatterplot. It works by passing in the Series of data that we want to use for creating our visualization, rather than using a declarative method.

In the following section, you’ll learn how to add color to scatterplots in Seaborn.

How to Add Color to Python Seaborn Scatter Plots with Hue

Currently, our scatterplot visualizes the distribution of two different variables. We can add in another variable by using color. This can be done using the hue= parameter, which also accepts the label of a column.

Depending on the type of variable you pass in, you’ll experience different behavior. The hue= parameter allows you to pass in:

  1. Categorical variables, where each color represents a category
  2. Continuous variables, where the color represents a gradient along the scale

Let’s first load in a categorical variable to see how we add in more dimensionality into our data:

# Adding a Categorical Color to Our Seaborn Scatterplot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')

sns.scatterplot(data=df, x='bill_length_mm', y='bill_depth_mm', hue='species')
plt.show()

This returns the following visualization:

Adding Color Using Discrete Variables in Seaborn Scatterplots

Because the data in the 'species' column are categorical, the colors represented in the scatterplot are broken out discretely. We can also see that a legend has been created.
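If you want control over which color each category gets, the palette= and hue_order= parameters from the table above can be combined with hue=. The sketch below is one way to do this. It uses a small, made-up DataFrame that mimics the penguins columns, and the specific color names are arbitrary choices for illustration.

```python
# A sketch of choosing colors and legend order for a categorical hue
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up rows that mimic the penguins columns (illustrative values only)
df = pd.DataFrame({
    'species': ['Adelie', 'Adelie', 'Chinstrap', 'Chinstrap', 'Gentoo', 'Gentoo'],
    'bill_length_mm': [39.0, 40.5, 46.5, 50.0, 46.0, 49.5],
    'bill_depth_mm': [18.5, 18.0, 18.0, 19.5, 13.5, 16.0],
})

# Map each category to a Matplotlib color name
palette = {'Adelie': 'tab:blue', 'Chinstrap': 'tab:orange', 'Gentoo': 'tab:green'}

ax = sns.scatterplot(
    data=df,
    x='bill_length_mm',
    y='bill_depth_mm',
    hue='species',
    hue_order=['Gentoo', 'Chinstrap', 'Adelie'],  # order used for the legend
    palette=palette
)
plt.show()
```

Passing a dictionary makes the color assignment explicit, which helps keep colors consistent across several charts of the same categories.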

We can also use the hue= parameter to pass in a continuous variable. Let’s see how our visualization changes by passing in the 'body_mass_g' variable:

# Adding a Continuous Color to Our Seaborn Scatterplot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')

sns.scatterplot(data=df, x='bill_length_mm', y='bill_depth_mm', hue='body_mass_g')
plt.show()

When we pass a continuous variable into the hue= parameter, the following image is returned. The colors now follow a gradient, where each point’s position along the color map indicates its value on the continuous scale.

Adding Color Using Continuous Variables in Seaborn Scatterplots
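The gradient itself can be adjusted with the palette= and hue_norm= parameters. The sketch below assumes a small, made-up DataFrame that mimics the penguins columns; the 'viridis' colormap and the 3000 to 6000 gram range are arbitrary choices for illustration.

```python
# A sketch of customizing the color gradient for a continuous hue
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up rows that mimic the penguins columns (illustrative values only)
df = pd.DataFrame({
    'bill_length_mm': [39.0, 40.5, 46.5, 50.0, 46.0, 49.5],
    'bill_depth_mm': [18.5, 18.0, 18.0, 19.5, 13.5, 16.0],
    'body_mass_g': [3700, 3300, 3500, 3900, 4500, 5700],
})

ax = sns.scatterplot(
    data=df,
    x='bill_length_mm',
    y='bill_depth_mm',
    hue='body_mass_g',
    palette='viridis',        # a Matplotlib colormap name
    hue_norm=(3000, 6000)     # data values mapped to the ends of the colormap
)
plt.show()
```

Fixing hue_norm= is useful when you want the same body mass to map to the same color across several plots.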

In the following section, you’ll learn how to customize the size of markers in Seaborn.

How to Change Marker Size in Python Seaborn Scatter Plots

Seaborn also allows you to customize the size of markers using the size= parameter. By passing in a Pandas DataFrame column label, the sizes of the markers will adjust relative to the values in the column. Let’s see what this looks like:

# Adjusting the Marker Size in Seaborn Scatterplots
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')

sns.scatterplot(data=df, x='bill_length_mm', y='bill_depth_mm', size='body_mass_g')
plt.show()

This returns the following image:

Changing Marker Size in Seaborn Scatterplots

We can see that the marker sizes don’t show much of a difference. Seaborn allows us to define the range of marker sizes by passing a tuple into the sizes= parameter. The tuple contains the minimum and maximum sizes, as shown below:

# Specifying the Size Ranges of Markers in Seaborn Scatterplots
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')

sns.scatterplot(data=df, x='bill_length_mm', y='bill_depth_mm', size='body_mass_g', sizes=(1,100))
plt.show()

This returns the following image:

Customizing Marker Size in Seaborn Scatterplots
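The sizes= parameter also accepts a dictionary, which can be handy when the size variable is categorical. The sketch below assumes a small, made-up DataFrame that mimics the penguins columns, and the marker sizes are arbitrary values picked for illustration.

```python
# A sketch of assigning an explicit marker size to each category
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up rows that mimic the penguins columns (illustrative values only)
df = pd.DataFrame({
    'species': ['Adelie', 'Adelie', 'Chinstrap', 'Chinstrap', 'Gentoo', 'Gentoo'],
    'bill_length_mm': [39.0, 40.5, 46.5, 50.0, 46.0, 49.5],
    'bill_depth_mm': [18.5, 18.0, 18.0, 19.5, 13.5, 16.0],
})

# Each level of the size variable gets its own marker size
size_map = {'Adelie': 30, 'Chinstrap': 80, 'Gentoo': 150}

ax = sns.scatterplot(
    data=df,
    x='bill_length_mm',
    y='bill_depth_mm',
    size='species',
    sizes=size_map
)
plt.show()
```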

In the following section, you’ll learn how to change markers in Seaborn scatter plots.

How to Change Markers in Python Seaborn Scatter Plots

Similar to modifying the color of markers in the scatter plots, we can modify the actual markers themselves. This has the added benefit of being more accessible and allowing you to print the visualizations in black and white.

We can do this by passing a variable into the style= parameter. This can also be combined with the hue= parameter you learned about previously. This way, each group is both colored and styled differently, allowing for better accessibility.

# Changing Marker Style in Seaborn
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')

sns.scatterplot(data=df, x='bill_length_mm', y='bill_depth_mm', style='species', hue='species')
plt.show()

This returns the following image:

Changing Markers in Seaborn Scatterplots

We can see that this makes the resulting visualization much more accessible, especially for those who are color blind.
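If the default marker shapes aren’t what you want, the markers= parameter accepts a dictionary that maps each level of the style variable to a Matplotlib marker code. The sketch below assumes a small, made-up DataFrame that mimics the penguins columns, and the marker choices are arbitrary.

```python
# A sketch of choosing specific markers for each category
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up rows that mimic the penguins columns (illustrative values only)
df = pd.DataFrame({
    'species': ['Adelie', 'Adelie', 'Chinstrap', 'Chinstrap', 'Gentoo', 'Gentoo'],
    'bill_length_mm': [39.0, 40.5, 46.5, 50.0, 46.0, 49.5],
    'bill_depth_mm': [18.5, 18.0, 18.0, 19.5, 13.5, 16.0],
})

# 'o' is a circle, 's' is a square, and '^' is an upward triangle
marker_map = {'Adelie': 'o', 'Chinstrap': 's', 'Gentoo': '^'}

ax = sns.scatterplot(
    data=df,
    x='bill_length_mm',
    y='bill_depth_mm',
    hue='species',
    style='species',
    markers=marker_map
)
plt.show()
```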

How to Add a Line to Python Seaborn Scatter Plots

By adding a line to a Seaborn scatterplot, you can visualize different regression trends between two variables. Because we’re really looking at analyzing the relationship between two variables from a standpoint of regression, we use the lmplot() function instead.

Let’s see how we can compare the bill length and depth and display a regression line in Seaborn:

# Adding a Regression Line to a Seaborn Scatter Plot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')

sns.lmplot(data=df, x='bill_length_mm', y='bill_depth_mm')
plt.show()

This returns the following image:

Adding a Line to a Seaborn Scatterplot
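If you’d rather keep working with sns.scatterplot(), one possible approach is to layer the axes-level sns.regplot() function onto the same Axes. The sketch below draws only the regression line with regplot() and leaves the points to scatterplot(). It assumes a small, made-up DataFrame that mimics the penguins columns, so the fitted line is purely illustrative.

```python
# A sketch of overlaying a regression line on an existing scatterplot
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up rows that mimic the penguins columns (illustrative values only)
df = pd.DataFrame({
    'species': ['Adelie', 'Adelie', 'Chinstrap', 'Chinstrap', 'Gentoo', 'Gentoo'],
    'bill_length_mm': [39.0, 40.5, 46.5, 50.0, 46.0, 49.5],
    'bill_depth_mm': [18.5, 18.0, 18.0, 19.5, 13.5, 16.0],
})

fig, ax = plt.subplots()

# Draw the points, colored by species
sns.scatterplot(data=df, x='bill_length_mm', y='bill_depth_mm', hue='species', ax=ax)

# Draw a single linear fit across all points, without re-drawing the points
# or the confidence band
sns.regplot(
    data=df,
    x='bill_length_mm',
    y='bill_depth_mm',
    scatter=False,
    ci=None,
    color='black',
    ax=ax
)
plt.show()
```

This keeps the hue= coloring from scatterplot() while still showing an overall trend line.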

In the following section, you’ll learn how to create 3D scatterplots in Seaborn.

How to Make 3D Scatterplots in Python Seaborn

In this section, you’ll learn how to create 3D scatter plots. Because Seaborn is built on top of Matplotlib, we can access many of the important aspects of the library. To learn about this process in more depth, check out my complete tutorial on creating 3D scatter plots in Python with Seaborn and Matplotlib.

# Creating a 3D Scatter Plot in Python
import seaborn as sns
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d

fig = plt.figure()
ax = plt.axes(projection='3d')

df = sns.load_dataset('penguins')
x=df['bill_length_mm']
y=df['bill_depth_mm']
z=df['body_mass_g']

ax.scatter(x, y, z)
plt.show()

In the code above, we:

  1. Imported mpl_toolkits to be able to project onto 3D axes (recent versions of Matplotlib register the 3D projection automatically, so the import is optional there)
  2. Declared fig and ax objects in order to specify that we want to create a 3D projection
  3. Defined our x, y, and z variables and passed them into the Matplotlib scatter() method

This returns the following image:

Creating a 3D Scatterplot
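Since the 3D plot is drawn with Matplotlib directly, Seaborn’s hue= parameter isn’t available. One way to get a similar effect is to plot each group separately and let Matplotlib cycle through colors. The sketch below assumes a small, made-up DataFrame that mimics the penguins columns.

```python
# A sketch of coloring a 3D scatter plot by category
import pandas as pd
import matplotlib.pyplot as plt

# Made-up rows that mimic the penguins columns (illustrative values only)
df = pd.DataFrame({
    'species': ['Adelie', 'Adelie', 'Chinstrap', 'Chinstrap', 'Gentoo', 'Gentoo'],
    'bill_length_mm': [39.0, 40.5, 46.5, 50.0, 46.0, 49.5],
    'bill_depth_mm': [18.5, 18.0, 18.0, 19.5, 13.5, 16.0],
    'body_mass_g': [3700, 3300, 3500, 3900, 4500, 5700],
})

fig = plt.figure()
ax = fig.add_subplot(projection='3d')

# Plot one group at a time so each species gets its own color and legend entry
for species, group in df.groupby('species'):
    ax.scatter(
        group['bill_length_mm'],
        group['bill_depth_mm'],
        group['body_mass_g'],
        label=species
    )

ax.set_xlabel('Bill Length (mm)')
ax.set_ylabel('Bill Depth (mm)')
ax.set_zlabel('Body Mass (g)')
ax.legend()
plt.show()
```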

In the following section, you’ll learn how to add multiple scatterplots in Python Seaborn.

Adding Multiple Scatterplots in Python Seaborn Using FacetGrid

We can use the Seaborn FacetGrid to add multiple scatterplots in Seaborn. This allows you to easily break out scatter plots across multiple variables. This means that you can better visualize how different elements are spread across variables.

Let’s see how we can use the Seaborn FacetGrid to plot multiple scatter plots:

# Using FacetGrid to Add Multiple Scatter Plots in Seaborn
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')

facet_grid = sns.FacetGrid(data=df, col='species', row='sex')
facet_grid.map(sns.scatterplot, 'bill_depth_mm', 'bill_length_mm')

plt.show()

This returns the following image:

Adding multiple scatterplots to a Seaborn facetgrid
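A FacetGrid can also color points by a third variable and carry a shared legend. The sketch below uses the grid’s map_dataframe(), add_legend(), and set_axis_labels() methods. It assumes a small, made-up DataFrame that mimics the penguins columns, with one row per species and sex.

```python
# A sketch of a FacetGrid with a hue variable, a legend, and axis labels
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up rows that mimic the penguins columns (illustrative values only)
df = pd.DataFrame({
    'species': ['Adelie', 'Adelie', 'Chinstrap', 'Chinstrap', 'Gentoo', 'Gentoo'],
    'sex': ['Male', 'Female', 'Male', 'Female', 'Male', 'Female'],
    'bill_length_mm': [40.5, 39.0, 50.0, 46.5, 49.5, 46.0],
    'bill_depth_mm': [18.5, 18.0, 19.5, 18.0, 16.0, 13.5],
})

# One column of plots per species, with points colored by sex
g = sns.FacetGrid(data=df, col='species', hue='sex')

# map_dataframe() passes x and y to the plotting function as keyword arguments
g.map_dataframe(sns.scatterplot, x='bill_length_mm', y='bill_depth_mm')

g.set_axis_labels('Bill Length (mm)', 'Bill Depth (mm)')
g.add_legend()
plt.show()
```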

In the following section, you’ll learn how to add a title to a Seaborn scatter plot.

How to Add a Title to Python Seaborn Scatter Plots

Because Seaborn uses Matplotlib under the hood, we can use different features of Matplotlib to customize our visualizations. For example, we can add a title using Matplotlib. This can be done using the plt.title() function, as shown below:

# Adding a Title to a Seaborn Scatterplot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')

sns.scatterplot(data=df, x='bill_length_mm', y='bill_depth_mm', style='species', hue='species')
plt.title('Understanding Bill Depth and Length')
plt.show()

This returns the following image:

Adding a title to a Seaborn Scatterplot

In the following section, you’ll learn how to add axis labels to a Seaborn scatter plot.

How to Add Labels to Python Seaborn Scatter Plots

Similar to adding a title to a Seaborn plot, we can use Matplotlib to add x-axis and y-axis labels. This can be done using the plt.xlabel() and plt.ylabel() functions respectively.

Let’s see how we can add axis labels to our plot:

# Adding Axis Labels to Our Seaborn Scatterplot
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('penguins')

sns.scatterplot(data=df, x='bill_length_mm', y='bill_depth_mm', style='species', hue='species')
plt.xlabel('Bill Length (mm)')
plt.ylabel('Bill Depth (mm)')
plt.show()

This returns the following image:

Adding axis labels to a scatterplot in Seaborn
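Because sns.scatterplot() returns the Matplotlib Axes it drew on, the title and both axis labels can also be set in a single call on that object. The sketch below shows this alternative. It assumes a small, made-up DataFrame that mimics the penguins columns.

```python
# A sketch of setting the title and axis labels through the returned Axes
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up rows that mimic the penguins columns (illustrative values only)
df = pd.DataFrame({
    'species': ['Adelie', 'Adelie', 'Chinstrap', 'Chinstrap', 'Gentoo', 'Gentoo'],
    'bill_length_mm': [39.0, 40.5, 46.5, 50.0, 46.0, 49.5],
    'bill_depth_mm': [18.5, 18.0, 18.0, 19.5, 13.5, 16.0],
})

# scatterplot() returns the Axes object that holds the plot
ax = sns.scatterplot(data=df, x='bill_length_mm', y='bill_depth_mm', hue='species')

# Axes.set() accepts the title and axis labels as keyword arguments
ax.set(
    title='Understanding Bill Depth and Length',
    xlabel='Bill Length (mm)',
    ylabel='Bill Depth (mm)'
)
plt.show()
```

Working with the Axes directly tends to be clearer than the plt functions once a figure contains more than one plot.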

Conclusion

In this post, you learned how to use Seaborn to create scatterplots. You first learned how to use the function to create simple scatterplots and how to use the function to customize every aspect of your visualization. You then learned how to modify colors, sizes and markers in your plots. You also learned how to create 3D scatterplots and how to add a regression line.


Nik Piepenbreier

Nik is the author of datagy.io and has over a decade of experience working with data analytics, data science, and Python. He specializes in teaching developers how to use Python for data science using hands-on tutorials.