Rename Pandas Columns with Pandas .rename()

Learn how to use the Pandas .rename() method to easily rename Pandas columns! You’ll learn a number of different ways to rename your columns, to meet your needs exactly!

You’ll also learn how to deal with files that arrive with meaningless column names. After reading this post, you’ll be able to pick the renaming approach that best fits your situation.

Loading our data

Let’s begin by loading our dataset. We’ll use pandas to create the dataframe that we’ll use throughout the tutorial. Let’s get started:

import pandas as pd

df = pd.DataFrame.from_dict(
    {
        'Name': ['Jane', 'Melissa', 'John', 'Matt'],
        'Age': [23, 45, 35, 64],
        'Age Group': ['18-35', '35-50', '35-50', '65+'],
        'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta'],
        ' Gender of person  ': ['Female', 'Female', 'Male', 'Male']
    }
)

print(df)

This returns the following dataframe:

      Name  Age Age Group Birth City  Gender of person  
0     Jane   23     18-35     London              Female
1  Melissa   45     35-50      Paris              Female
2     John   35     35-50    Toronto                Male
3     Matt   64       65+    Atlanta                Male

Let’s explore the columns of the dataframe using the .columns attribute. Printing df.columns returns all of the column labels:

print(df.columns)

This returns the following:

Index(['Name', 'Age', 'Age Group', 'Birth City', ' Gender of person  '], dtype='object')

We can see that there are a number of quirks with the column names: inconsistent casing, spaces inside names, and leading and trailing spaces in the last column.
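
Padding like this is easy to miss in a wide dataframe. As a minimal sketch, we could list the labels that change when stripped. The empty dataframe below is an assumption made for brevity; it reuses only the column labels from above.

```python
import pandas as pd

# Same column labels as the tutorial dataframe (row data omitted for brevity)
df = pd.DataFrame(columns=['Name', 'Age', 'Age Group', 'Birth City', ' Gender of person  '])

# Keep only the labels that have leading or trailing whitespace
padded = [column for column in df.columns if column != column.strip()]

print(padded)
```

Only the last column is reported, since it is the only one with whitespace at either end.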

Overview of Pandas .rename() method

The Pandas .rename() method alters axis labels, for either rows or columns. We’ll focus on columns in this tutorial.

You can pass in a mapper, which is either a dictionary or a function. (To replace every column name at once with a list, assign the list to df.columns instead, since .rename() does not accept a plain list.) When you use a mapper, the new values must be unique (i.e., a 1-to-1 match). Labels that don’t appear in a dictionary mapper are left as-is.
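
To make the rows-versus-columns distinction concrete, here is a small sketch on a two-row dataframe invented for illustration. It uses the columns= and index= parameters of .rename() and shows that the original dataframe is left unchanged.

```python
import pandas as pd

# Small illustrative dataframe (not the tutorial dataset)
df = pd.DataFrame({'Name': ['Jane', 'Melissa'], 'Age': [23, 45]})

# Rename a column label with a dictionary mapper
renamed_columns = df.rename(columns={'Name': 'name'})

# Rename row labels with the index= parameter instead
renamed_rows = df.rename(index={0: 'first', 1: 'second'})

print(renamed_columns.columns.tolist())
print(renamed_rows.index.tolist())
print(df.columns.tolist())  # the original dataframe keeps its labels
```

Because .rename() returns a new dataframe by default, the examples in this tutorial assign the result back to a variable.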

How to rename a single Pandas column

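To rename a single column, we can approach this in multiple ways. The easiest way is to pass in a dictionary with just a single key:value pair. Each example in this tutorial starts from the original dataframe created above, so re-create it before each example if you’re following along.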

Renaming a single column by name

For example, if we wanted to rename the Age Group column to age_group, we could write:

df = df.rename(columns={'Age Group': 'age_group'})

print(df.columns)

This returns the following:

Index(['Name', 'Age', 'age_group', 'Birth City', ' Gender of person  '], dtype='object')

Renaming a single column by position

Now, say you didn’t know what the first column was called, but you knew you wanted to change it. You could look up its current name by indexing the Index object returned by the .columns attribute. If you wanted to change the first column to id, you could write:

df = df.rename(columns={df.columns[0]: 'id'})

print(df.columns)

This returns:

Index(['id', 'Age', 'Age Group', 'Birth City', ' Gender of person  '], dtype='object')

What we’ve done here is use the label in the first position of df.columns as the dictionary key.
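
The same idea extends to several positions at once. The sketch below is one possible approach rather than a built-in Pandas feature: the new_names_by_position dictionary is a hypothetical helper that we translate into a regular label mapping.

```python
import pandas as pd

# Column labels only; row data omitted for brevity
df = pd.DataFrame(columns=['Name', 'Age', 'Age Group', 'Birth City'])

# Hypothetical mapping of column positions to new names (-1 is the last column)
new_names_by_position = {0: 'id', -1: 'city'}

# Translate each position into its current label, then rename as usual
mapping = {
    df.columns[position]: new_name
    for position, new_name in new_names_by_position.items()
}
df = df.rename(columns=mapping)

print(df.columns.tolist())
```

Keep in mind that the rename still happens by label. If two columns share the same name, both of them are renamed.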

How to rename multiple Pandas columns

To rename multiple columns in Pandas, we can simply pass in a dictionary with more key:value pairs. We can even combine the two methods above.

Let’s give this a shot. We’ll rename the first column to id and lowercase the Age and Age Group columns.

df = df.rename(columns={
    df.columns[0]: 'id', 
    'Age': 'age',
    'Age Group': 'age group'})

print(df.columns)

This returns the following:

Index(['id', 'age', 'age group', 'Birth City', ' Gender of person  '], dtype='object')
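
If the old and new names already live in two lists (for example, copied from a data dictionary), we could build the mapping with zip() instead of typing it out. This is a sketch; the old_names and new_names lists are made up for illustration.

```python
import pandas as pd

# Column labels only; row data omitted for brevity
df = pd.DataFrame(columns=['Name', 'Age', 'Age Group', 'Birth City'])

# Illustrative lists of current names and their replacements, in matching order
old_names = ['Name', 'Age', 'Age Group']
new_names = ['id', 'age', 'age group']

# Pair the names up into the dictionary that .rename() expects
mapping = dict(zip(old_names, new_names))
df = df.rename(columns=mapping)

print(df.columns.tolist())
```

Columns that aren’t listed, such as Birth City here, keep their original names.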

How to use a list comprehension to rename Pandas columns

There may be many times when you’re working on a large dataset and you want to streamline column names. For example, spaces can be particularly annoying because a column whose name contains a space can’t be accessed with dot notation. Inconsistent casing is another common annoyance, since Pandas column labels are case-sensitive.

A list comprehension is particularly helpful if you’re attempting to make multiple transformations consistently across all columns. To learn more about list comprehensions, check out my comprehensive tutorial, which is also available in video form.

Let’s see how we can remove extra spaces from our columns, replace inline spaces with underscores, and lowercase all our column names:

df.columns = [column.strip().replace(' ', '_').lower() for column in df.columns]

print(df.columns)

This returns the following:

Index(['name', 'age', 'age_group', 'birth_city', 'gender_of_person'], dtype='object')

What we’ve done is apply the following transformations:

  • .strip() removes any trailing and leading spaces,
  • .replace() swaps each remaining space for an underscore, and
  • .lower() lowercases our column names.
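
The same three steps can also be written with the vectorized .str accessor that Pandas provides on the column Index. This is an alternative sketch, not the approach used above, and it reuses only the column labels of our dataframe.

```python
import pandas as pd

# Same column labels as the tutorial dataframe (row data omitted for brevity)
df = pd.DataFrame(columns=['Name', 'Age', 'Age Group', 'Birth City', ' Gender of person  '])

df.columns = (
    df.columns
    .str.strip()                         # drop leading and trailing spaces
    .str.replace(' ', '_', regex=False)  # treat the space as a literal string
    .str.lower()                         # lowercase every label
)

print(df.columns.tolist())
```

Both versions produce the same labels, so the choice mostly comes down to which reads better to you.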

Using a mapper function to rename Pandas columns

You can also use mapper functions to rename Pandas columns.

Say we simply wanted to lowercase all of our columns. We could do this by passing a mapper function directly into the .rename() method:

df = df.rename(mapper=str.lower, axis='columns')

print(df.columns)

We use axis='columns' to specify that we want to apply this transformation on the columns. Similarly, you could write: axis=1.

This returns:

Index(['name', 'age', 'age group', 'birth city', ' gender of person  '], dtype='object')
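
The columns= parameter accepts a function as well, which saves us from spelling out the axis. A short sketch comparing the equivalent spellings on a two-column dataframe invented for illustration:

```python
import pandas as pd

# Column labels only; row data omitted for brevity
df = pd.DataFrame(columns=['Name', 'Age Group'])

by_axis_name = df.rename(mapper=str.lower, axis='columns')
by_axis_number = df.rename(mapper=str.lower, axis=1)
by_keyword = df.rename(columns=str.lower)

# All three calls produce the same column labels
print(by_axis_name.columns.tolist())
print(by_axis_number.columns.tolist())
print(by_keyword.columns.tolist())
```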

Using a lambda function to rename Pandas columns

You can also use lambda functions to pass in more complex transformations, as we did with our list comprehension. Say we wanted to replicate that example (removing leading/trailing spaces, replacing inline spaces with underscores, and lowercasing everything). We could write:

df = df.rename(mapper=lambda x: x.strip().replace(' ', '_').lower(), axis=1)

print(df.columns)

As before, axis=1 tells Pandas to apply the function to the columns. Writing axis='columns' is equivalent.

This returns the following:

Index(['name', 'age', 'age_group', 'birth_city', 'gender_of_person'], dtype='object')
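
Once the transformation grows, a named function can be easier to read and reuse than a lambda. The sketch below uses a hypothetical clean_label() helper, which also skips labels that aren’t strings (the integer column 2024 is made up to show this).

```python
import pandas as pd

def clean_label(label):
    """Strip, underscore, and lowercase string labels; leave other labels untouched."""
    if not isinstance(label, str):
        return label
    return label.strip().replace(' ', '_').lower()

# Illustrative labels, including one non-string label
df = pd.DataFrame(columns=['Name', ' Gender of person  ', 2024])
df = df.rename(columns=clean_label)

print(df.columns.tolist())
```

The lambda from the example above would fail on the integer label, because integers have no .strip() method.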

Using inplace to rename Pandas columns in place

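You may have noticed that in all of our examples we have reassigned the dataframe to itself. We can avoid having to do this by using the boolean inplace= parameter in our method call. With inplace=True, the dataframe is modified directly and the method returns None, so the result shouldn’t be assigned back to df. Let’s use our previous example to illustrate this: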

df.rename(mapper=lambda x: x.strip().replace(' ', '_').lower(), axis=1, inplace=True)
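
A common slip is to combine inplace=True with reassignment, which replaces the dataframe with None. A minimal sketch of what the call returns, on a small dataframe invented for illustration:

```python
import pandas as pd

# Column labels only; row data omitted for brevity
df = pd.DataFrame(columns=['Name', 'Age Group'])

# With inplace=True, the method modifies df and returns None
result = df.rename(columns=str.lower, inplace=True)

print(result)
print(df.columns.tolist())
```

In other words, use either df = df.rename(...) or df.rename(..., inplace=True), but not both together.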

Raising errors while renaming Pandas columns

By default, the .rename() method will not raise any errors when the mapping includes a column that doesn’t exist. This can lead to subtle bugs when you assume that a column has been renamed when in actuality it hasn’t.

To make Pandas raise an error instead, pass errors='raise'. Let’s see this in action by attempting to rename a column that doesn’t exist:

df = df.rename(columns={'some silly name': 'column1'}, errors='raise')

print(df.columns)

This returns the following error:

Traceback (most recent call last):
  File "/Users/nikpi/Desktop/Rename Your Columns.py", line 13, in <module>
    df = df.rename(columns={'some silly name': 'column1'}, errors='raise')
  File "/Users/nikpi/Library/Python/3.8/lib/python/site-packages/pandas/util/_decorators.py", line 312, in wrapper
    return func(*args, **kwargs)
  File "/Users/nikpi/Library/Python/3.8/lib/python/site-packages/pandas/core/frame.py", line 4438, in rename
    return super().rename(
  File "/Users/nikpi/Library/Python/3.8/lib/python/site-packages/pandas/core/generic.py", line 1054, in rename
    raise KeyError(f"{missing_labels} not found in axis")
KeyError: "['some silly name'] not found in axis"
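
To compare the two behaviours side by side, we could run the same rename with and without errors='raise' and catch the KeyError. This is a sketch on a two-column dataframe invented for illustration.

```python
import pandas as pd

# Column labels only; row data omitted for brevity
df = pd.DataFrame(columns=['Name', 'Age'])

# Default behaviour: the unknown label is skipped silently
unchanged = df.rename(columns={'some silly name': 'column1'})
print(unchanged.columns.tolist())

# With errors='raise', the same call raises a KeyError that we can handle
try:
    df.rename(columns={'some silly name': 'column1'}, errors='raise')
    raised = False
except KeyError as error:
    raised = True
    print(f'Rename failed: {error}')
```

Using errors='raise' is a cheap way to catch typos in column names early in a script.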

Renaming MultiIndex Pandas columns

The .rename() method also includes a level= argument to specify which level of a MultiIndex you want to rename. To learn more about Pandas pivot tables, check out my comprehensive overview (complete with a video tutorial!).

Say we create a Pandas pivot table and only want to rename a column in the first level. We could write:

import pandas as pd

df = pd.DataFrame.from_dict(
    {
        'Name': ['Jane', 'Melissa', 'John', 'Matt'],
        'Age': [23, 45, 35, 64],
        'Age Group': ['18-35', '35-50', '35-50', '65+'],
        'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta'],
        ' Gender of person  ': ['Female', 'Female', 'Male', 'Male']
    }
)

df.columns = [column.strip().replace(' ', '_').lower() for column in df.columns]

pivot = pd.pivot_table(
    data=df,
    columns=['gender_of_person', 'age_group'],
    values='age',
    aggfunc='count'
)

pivot = pivot.rename(columns={'Male':'male'}, level=0)

print(pivot)

This returns the following dataframe:

gender_of_person Female        male    
age_group         18-35 35-50 35-50 65+
age                   1     1     1   1
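
The level= argument also accepts a level name instead of a position. As a sketch, the block below rebuilds the pivot table’s columns by hand (an assumption made to keep the example short) and renames a label in the age_group level only.

```python
import pandas as pd

# Hand-built columns matching the pivot table above
columns = pd.MultiIndex.from_tuples(
    [('Female', '18-35'), ('Female', '35-50'), ('Male', '35-50'), ('Male', '65+')],
    names=['gender_of_person', 'age_group'],
)
pivot = pd.DataFrame([[1, 1, 1, 1]], index=['age'], columns=columns)

# Rename within the second level only, referring to it by name
renamed = pivot.rename(columns={'35-50': '35 to 50'}, level='age_group')

print(renamed.columns.tolist())
```

Every occurrence of the label within that level is renamed, so both the Female and Male groups pick up the new name.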

Conclusion

In this post, you learned about the different ways to rename columns in a Pandas dataframe. You learned how to be specific about which columns to rename, how to apply transformations to all columns, and how to rename only columns in a specific level of a MultiIndex dataframe.

To learn more about the Pandas .rename() method, check out the official documentation.