Pandas Replace: Replace Values in Pandas Dataframe

In this post, you’ll learn how to use the Pandas .replace() method to replace data in your dataframe. The DataFrame.replace() method can replace strings, numeric values, and even text matched by regular expressions (regex). It’s an immensely powerful method, so let’s dive right in!

Loading Sample Dataframe

To start things off, let’s begin by loading a Pandas dataframe. We’ll keep things simple so it’s easier to follow exactly what we’re replacing.

import pandas as pd

df = pd.DataFrame.from_dict(
    {
        'Name': ['Jane', 'Melissa', 'John', 'Matt'],
        'Age': [23, 45, 35, 64],
        'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta'],
        'Gender': ['F', 'F', 'M', 'M']
    }
)

print(df)

This returns the following dataframe:

      Name  Age Birth City Gender
0     Jane   23     London      F
1  Melissa   45      Paris      F
2     John   35    Toronto      M
3     Matt   64    Atlanta      M

Pandas Replace Method Syntax

The Pandas .replace() method takes a number of different parameters. Let’s take a look at them:

DataFrame.replace(
    to_replace=None, 
    value=None, 
    inplace=False, 
    limit=None, 
    regex=False, 
    method='pad')

Let’s take a closer look at what these actually mean:

  • to_replace: the value or values to look for; accepts a string, regex, list, dictionary, int, float, etc.
  • value: the value to replace matches with
  • inplace: whether to modify the object in place rather than return a new one
  • limit: the maximum size gap to forward or backward fill (deprecated since Pandas 2.1)
  • regex: whether to interpret to_replace and/or value as regular expressions
  • method: the fill method to use when to_replace is a scalar or list and value is None (deprecated since Pandas 2.1)
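
To make the to_replace options above more concrete, here is a minimal sketch that passes a scalar, a list, and a dictionary. It rebuilds the sample dataframe so that it runs on its own, and the variable names (older, regions, spelled_out) are purely illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jane', 'Melissa', 'John', 'Matt'],
    'Age': [23, 45, 35, 64],
    'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta'],
    'Gender': ['F', 'F', 'M', 'M'],
})

# Scalar: replace the number 23 with 24 wherever it appears
older = df.replace(to_replace=23, value=24)

# List: replace several values with one value
regions = df.replace(to_replace=['London', 'Paris'], value='Europe')

# Dictionary: keys are the values to find, dict values are the replacements
spelled_out = df.replace(to_replace={'F': 'Female', 'M': 'Male'})

print(older['Age'].tolist())
print(regions['Birth City'].tolist())
print(spelled_out['Gender'].tolist())
```

With the dictionary form there is no need to pass value at all, since each key already carries its own replacement.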

Replace a Single Value in Pandas

Let’s learn how to replace a single value in a Pandas column.

In the example below, we’ll look to replace the value Jane with Joan:

df['Name'] = df['Name'].replace(to_replace='Jane', value='Joan')

print(df)

This returns the following dataframe:

      Name  Age Birth City Gender
0     Joan   23     London      F
1  Melissa   45      Paris      F
2     John   35    Toronto      M
3     Matt   64    Atlanta      M
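
Notice that the example assigns the result back to the column. As a small sketch of why (using a standalone copy of the Name column, with the illustrative variable names names and renamed), .replace() returns a new object by default and leaves the original untouched:

```python
import pandas as pd

names = pd.Series(['Jane', 'Melissa', 'John', 'Matt'], name='Name')

# .replace() returns a new Series; `names` itself is not modified
renamed = names.replace(to_replace='Jane', value='Joan')

print(renamed.tolist())  # the replaced copy
print(names.tolist())    # the original, still containing 'Jane'
```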

Replace Multiple Values with the Same Value in Pandas

Now, you may want to replace multiple values with the same value. This is just as easy to do using the .replace() method. Note that each example from here on starts again from the original sample dataframe.

Of course, you could simply run the method twice, but there’s a much more efficient way to accomplish this. Here, we’ll look to replace London and Paris with Europe:

df['Birth City'] = df['Birth City'].replace(
    to_replace=['London', 'Paris'], 
    value='Europe')

print(df)

This returns the following dataframe:

      Name  Age Birth City Gender
0     Jane   23     Europe      F
1  Melissa   45     Europe      F
2     John   35    Toronto      M
3     Matt   64    Atlanta      M

Now let’s take a look at how to replace multiple values with different values.

Replace Multiple Values with Different Values in Pandas

Similar to the example above, you can replace a list of multiple values with a list of different values.

This is as easy as passing a list to each of the to_replace and value parameters. It’s important to note that the lists must be the same length, since the values are paired up by position.

In the example below, we’ll replace London with England and Paris with France:

df['Birth City'] = df['Birth City'].replace(
    to_replace=['London', 'Paris'], 
    value=['England', 'France'])

print(df)

This returns the following dataframe:

      Name  Age Birth City Gender
0     Jane   23    England      F
1  Melissa   45     France      F
2     John   35    Toronto      M
3     Matt   64    Atlanta      M
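
If keeping two lists in sync feels error-prone, a dictionary is one alternative. The sketch below (on a standalone copy of the Birth City column, with illustrative variable names) shows both forms side by side so you can check that they give the same result:

```python
import pandas as pd

cities = pd.Series(['London', 'Paris', 'Toronto', 'Atlanta'], name='Birth City')

# Paired lists: 'London' -> 'England', 'Paris' -> 'France' (matched by position)
from_lists = cities.replace(
    to_replace=['London', 'Paris'],
    value=['England', 'France'])

# Dictionary: each key is replaced by its value, so the pairing is explicit
from_dict = cities.replace(to_replace={'London': 'England', 'Paris': 'France'})

print(from_lists.equals(from_dict))
```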

Replace Values in the Entire Dataframe

In the previous examples, you learned how to replace values in a single column. Similar to those examples, we can easily replace values in the entire dataframe.

Let’s take a look at replacing the letter M with P in the entire dataframe:

df = df.replace(
    to_replace='M', 
    value='P')

print(df)

This returns the following dataframe:

      Name  Age Birth City Gender
0     Jane   23     London      F
1  Melissa   45      Paris      F
2     John   35    Toronto      P
3     Matt   64    Atlanta      P

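We can see that this didn’t return the expected results: only cells whose entire value is M were replaced, so the M in Melissa and Matt was left untouched.

In order to replace substrings (such as the M in Melissa), we simply pass in regex=True: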

df = df.replace(
    to_replace='M', 
    value='P',
    regex=True)

print(df)

This returns the expected dataframe:

      Name  Age Birth City Gender
0     Jane   23     London      F
1  Pelissa   45      Paris      F
2     John   35    Toronto      P
3     Patt   64    Atlanta      P
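
To see the difference between the two behaviours in isolation, here is a short sketch on a standalone copy of the Name column. The variable names and the anchored pattern are illustrative additions, not part of the examples above.

```python
import pandas as pd

names = pd.Series(['Jane', 'Melissa', 'John', 'Matt'], name='Name')

# Without regex, 'M' must equal the whole cell, so nothing changes here
exact = names.replace(to_replace='M', value='P')

# With regex=True, 'M' is a pattern and matches inside longer strings
pattern = names.replace(to_replace='M', value='P', regex=True)

# Anchors narrow the match: '^M$' only matches cells that are exactly 'M'
anchored = names.replace(to_replace=r'^M$', value='P', regex=True)

print(exact.tolist())
print(pattern.tolist())
print(anchored.tolist())
```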

Finally, let’s take a closer look at more complex regular expression replacements.

Replacing Values with Regex (Regular Expressions)

We can use regular expressions to make complex replacements.

We’ll cover a fairly simple example, where we replace any four-letter word in the Name column with “Four letter name”.

The following .replace() method call does just that:

df['Name'] = df['Name'].replace(
    to_replace=r'\b\w{4}\b', 
    value='Four letter name',
    regex=True)

print(df)

This returns the following dataframe:

               Name  Age Birth City Gender
0  Four letter name   23     London      F
1           Melissa   45      Paris      F
2  Four letter name   35    Toronto      M
3  Four letter name   64    Atlanta      M
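
Regex replacements can also reuse part of what was matched. The sketch below assumes that replacement strings follow the conventions of Python’s re.sub, so that \1 refers to the first capture group; the pattern and the variable names are illustrative rather than taken from the example above.

```python
import pandas as pd

names = pd.Series(['Jane', 'Melissa', 'John', 'Matt'], name='Name')

# (\w) captures the first letter; \1 re-inserts it in the replacement,
# so each single-word name is shortened to its initial followed by a dot
initials = names.replace(to_replace=r'^(\w)\w*$', value=r'\1.', regex=True)

print(initials.tolist())
```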

Replace Values In Place with Pandas

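We can also replace values in place, rather than having to re-assign the result. This is done simply by setting inplace=True. It’s safest to call the method on the dataframe itself: calling it on a selected column, such as df['Birth City'], operates on an intermediate object, which recent versions of Pandas warn about and which may leave the dataframe unchanged.

Let’s re-visit an earlier example:

df.replace(
    to_replace='Paris', 
    value='France',
    inplace=True)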

print(df)

This returns the following dataframe:

      Name  Age Birth City Gender
0     Jane   23     London      F
1  Melissa   45     France      F
2     John   35    Toronto      M
3     Matt   64    Atlanta      M
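
If an in-place replacement should only touch one column, one option is a nested dictionary, where the outer key names the column to search. The sketch below uses a tiny made-up dataframe (not the sample data) in which Paris appears in two columns, so the effect of the column scoping is visible:

```python
import pandas as pd

# Illustrative data: 'Paris' appears both as a name and as a city
df = pd.DataFrame({
    'Name': ['Jane', 'Paris'],
    'Birth City': ['London', 'Paris'],
})

# Outer key = column to search; inner dict = {value to find: replacement}
df.replace({'Birth City': {'Paris': 'France'}}, inplace=True)

print(df)
```

Only the Birth City column is changed; the name Paris is left as it was.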

Conclusion

In this post, you learned how to use the Pandas replace method to, well, replace values in a Pandas dataframe. The .replace() method is extremely powerful and lets you replace values across a single column, multiple columns, and an entire dataframe. The method also incorporates regular expressions to make complex replacements easier.

To learn more about the Pandas .replace() method, check out the official Pandas documentation for DataFrame.replace().