Unpivot Your Data with the Pandas Melt Function

  • by
Cover Image for Univot Your Data with Pandas Melt
  • Save

In this post, you’ll learn how to use the Pandas melt function.

The melt function is used to reshape a Pandas from wide to long format. What this means is that one or more columns are used as identifiers and all other columns are used as values.

Table of Contents

Why Use Pandas Melt to Unpivot Data?

This is helpful if you’re given data in a wide format, such as report you find online or you may have been given by a colleague. This data is easy to understand, but it’s harder to reshape into some other form of analysis.

Tip! Once your data has been reshaped, you can also Create Pivot Tables directly using Python and Pandas or discover descriptive statistics using python!

Loading a Sample Dataframe

Let’s begin by loading a sample dataframe in wide-format.:

import pandas as pd

df = pd.DataFrame({'name': ['John', 'Mel', 'Peter', 'Frank', 'Nik', 'Eva'],
    'age': [27, 32, 35, 31, 29, 30],
    'education': ['master', 'doctor', 'MBA', 'bachelor', 'MBA', 'bachelor'],
    'salary': [100000,120000, 85000, 75000, 95000, 80000]})

print(df.head())

This generates our dataset in a wide-format. The printout shows:

    name  age education  salary
0   John   27    master  100000
1    Mel   32    doctor  120000
2  Peter   35       MBA   85000
3  Frank   31  bachelor   75000
4    Nik   29       MBA   95000

What is the Pandas Melt Syntax?

The general format for the melt function looks like this:

pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)

Let’s take a look at all of these in a bit more detail:

  • Frame: the data frame we’re looking to melt
  • id_vars: the column(s) we’re using as identifying variables
  • value_vars: the columns we’re looking to unpivot. If this isn’t specified, any column not in id_vars is used.
  • var_name: the name used for the variable column.
  • value_name: the name used for the value column. Will default to values.
  • col_level: if columns are a multi-index, use this to melt.

How to Unpivot Your Data Using the Pandas Melt Function

Let’s take a look at how we can use the Pandas melt function to unpivot the dataset.

From what we learned earlier, we need to reassign the dataframe:

melted = pd.melt(df, id_vars = 'name', var_name = 'Attribute', value_name = 'Value')

print(melted.head())

Notice that we didn’t include the value_vars. Since we wanted to unpivot all of our data, Pandas was able to assign all the columns.

This is the same as writing:

melted = pd.melt(df, id_vars = 'name', value_vars = ['age', 'education', 'salary'], var_name = 'Attribute', value_name = 'Value')

print(melted.head())

Both of these return the following output:

     name  Attribute     Value
0    John        age        27
1     Mel        age        32
2   Peter        age        35
3   Frank        age        31
4     Nik        age        29

Conclusion

Using just a few steps in Python, we were able to unpivot data using Pandas and the melt function to transform a cross-tab style, wide dataset into one that’s become much more useful for further analysis.

Find out more about the melt function by checking out its official documentation.

Cover of Introduction to Python for Data Science
  • Save

Check our our eBook to get started with Python for Data Science!