
Pandas: Replace NaN with Zeroes


Working with missing data is an essential skill for any data analyst or data scientist! In many cases, you’ll want to replace your missing data, or NaN values, with zeroes. In this tutorial, you’ll learn how to use Pandas to replace NaN values with zeroes. Replacing missing values is a common step in cleaning and transforming your data.

By the end of this tutorial, you’ll have learned:

  • How to use Pandas to replace NaN values with zeroes for a single column, multiple columns, and an entire DataFrame
  • How to use the Pandas .replace() method with NumPy’s np.nan to replace NaN values in a Pandas DataFrame
  • How to replace NaN values in a Pandas DataFrame in-place

Loading a Sample Pandas DataFrame

To follow along with the tutorial, I have provided a sample Pandas DataFrame. To load the DataFrame, we’ll import Pandas using the alias pd and pass a dictionary into the DataFrame() constructor. Since we also want to include some NaN values, we’ll import NumPy too.

# Loading a Sample Pandas DataFrame
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Col_A': [1, 2, 3, np.nan],  # np.nan is used because the np.NaN alias was removed in NumPy 2.0
    'Col_B': [1, np.nan, 3, 4],
    'Col_C': [1, 2, np.nan, 4],
})

print(df)

# Returns:
#    Col_A  Col_B  Col_C
# 0    1.0    1.0    1.0
# 1    2.0    NaN    2.0
# 2    3.0    3.0    NaN
# 3    NaN    4.0    4.0

We can see that we have three columns, each of which contains missing data.
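Before filling anything, it can help to confirm where the missing values are. The sketch below rebuilds the sample DataFrame and counts the NaN values in each column with isna().sum(). The variable name missing_counts is only an illustrative choice and is not part of the original tutorial.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Col_A': [1, 2, 3, np.nan],
    'Col_B': [1, np.nan, 3, 4],
    'Col_C': [1, 2, np.nan, 4],
})

# isna() marks each missing cell as True; sum() then counts the True values per column
missing_counts = df.isna().sum()
print(missing_counts)

# Returns:
# Col_A    1
# Col_B    1
# Col_C    1
# dtype: int64
```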

How to Replace NaN Values with Zeroes for a Single Pandas Column

To replace all missing values with zeroes in a single column of a Pandas DataFrame, we can apply the .fillna() method to the column. The method accepts a value with which to replace missing data. In this case, we pass in the value 0.

# Replace NaN Values with Zeroes for a Single Pandas Column
import pandas as pd
import numpy as np

df = pd.DataFrame({'Col_A': [1, 2, 3, np.nan], 'Col_B': [1, np.nan, 3, 4], 'Col_C': [1, 2, np.nan, 4]})

df['Col_A'] = df['Col_A'].fillna(0)

print(df)

# Returns:
#    Col_A  Col_B  Col_C
# 0    1.0    1.0    1.0
# 1    2.0    NaN    2.0
# 2    3.0    3.0    NaN
# 3    0.0    4.0    4.0

In the code above, we reassign the column 'Col_A' to itself. In reassigning it, we apply the .fillna() method, passing 0 as the fill value. In the following section, you’ll learn how to replace all missing values for multiple columns.
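Note that the filled column still displays 0.0 rather than 0, because the column keeps its float64 data type after filling. If whole numbers are wanted, one option is to chain .astype() after .fillna(). This is a minimal sketch that assumes every remaining value in the column is a whole number. The explicit 'int64' target is my own choice for a predictable result across platforms.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col_A': [1, 2, 3, np.nan], 'Col_B': [1, np.nan, 3, 4], 'Col_C': [1, 2, np.nan, 4]})

# Fill the missing value first, then convert the float column to integers
df['Col_A'] = df['Col_A'].fillna(0).astype('int64')

print(df['Col_A'].tolist())
print(df['Col_A'].dtype)

# Returns:
# [1, 2, 3, 0]
# int64
```

Calling .astype('int64') on a column that still contains NaN raises an error, which is why the fill comes first.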

How to Replace NaN Values with Zeroes for Multiple Pandas Columns

In order to replace NaN values with zeroes for multiple columns in a Pandas DataFrame, we can apply the fillna method to multiple columns. In order to modify multiple columns, we can pass a list of column labels into the selector. Let’s see what this looks like:

# Replace NaN Values with Zeroes for Two Pandas Columns
import pandas as pd
import numpy as np

df = pd.DataFrame({'Col_A': [1, 2, 3, np.nan], 'Col_B': [1, np.nan, 3, 4], 'Col_C': [1, 2, np.nan, 4]})

df[['Col_A', 'Col_B']] = df[['Col_A', 'Col_B']].fillna(0)

print(df)

# Returns:
#    Col_A  Col_B  Col_C
# 0    1.0    1.0    1.0
# 1    2.0    0.0    2.0
# 2    3.0    3.0    NaN
# 3    0.0    4.0    4.0

In the code above, we select multiple columns by passing a list of column labels into the df[] selector. We then apply the .fillna() method, passing in 0. This replaces all missing values with 0 in the selected columns only, leaving 'Col_C' untouched.
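As an alternative to selecting the columns first, .fillna() also accepts a dictionary that maps column labels to fill values. The sketch below is not from the original tutorial. It fills the same two columns and should produce the same result as the list-based selection above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col_A': [1, 2, 3, np.nan], 'Col_B': [1, np.nan, 3, 4], 'Col_C': [1, 2, np.nan, 4]})

# Keys are column labels, values are the fill value for that column;
# columns not listed in the dictionary (here 'Col_C') are left unchanged
df = df.fillna({'Col_A': 0, 'Col_B': 0})

print(df)

# Returns:
#    Col_A  Col_B  Col_C
# 0    1.0    1.0    1.0
# 1    2.0    0.0    2.0
# 2    3.0    3.0    NaN
# 3    0.0    4.0    4.0
```

The dictionary form is also handy when different columns need different fill values.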

How to Replace NaN Values with Zeroes for a Pandas DataFrame

The Pandas .fillna() method can also be applied to an entire DataFrame. In this case, the missing values in every column are filled with the value that’s passed into the method. This can be a helpful approach when you want every missing value in a DataFrame to be filled the same way.

# Replace NaN Values with Zeroes for an Entire DataFrame
import pandas as pd
import numpy as np

df = pd.DataFrame({'Col_A': [1, 2, 3, np.nan], 'Col_B': [1, np.nan, 3, 4], 'Col_C': [1, 2, np.nan, 4]})

df = df.fillna(0)

print(df)

# Returns:
#    Col_A  Col_B  Col_C
# 0    1.0    1.0    1.0
# 1    2.0    0.0    2.0
# 2    3.0    3.0    0.0
# 3    0.0    4.0    4.0

In the code block above, we re-assign the DataFrame to itself, applying the fillna method. We pass in the value of 0 in order to replace all missing values with zeroes.

How to Replace NaN Values with Zeroes for a Pandas DataFrame In Place

Similarly, we can replace all NaN values in a Pandas DataFrame in place. This means we don’t have to re-assign the DataFrame to itself. Keep in mind that this is mainly a convenience: in-place operations in Pandas are not guaranteed to be faster or to use less memory, since Pandas may still create a copy internally.

# Replace NaN Values with Zeroes for a DataFrame In Place
import pandas as pd
import numpy as np

df = pd.DataFrame({'Col_A': [1, 2, 3, np.nan], 'Col_B': [1, np.nan, 3, 4], 'Col_C': [1, 2, np.nan, 4]})

df.fillna(0, inplace=True)

print(df)

# Returns:
#    Col_A  Col_B  Col_C
# 0    1.0    1.0    1.0
# 1    2.0    0.0    2.0
# 2    3.0    3.0    0.0
# 3    0.0    4.0    4.0

In the code above, we simply pass in inplace=True as a keyword argument. This modifies the DataFrame directly, replacing all missing values.
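To make the difference between the two styles concrete, the sketch below counts the remaining NaN values after each kind of call. The variable names filled, before, and after are illustrative and not part of the original tutorial.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col_A': [1, 2, 3, np.nan], 'Col_B': [1, np.nan, 3, 4], 'Col_C': [1, 2, np.nan, 4]})

# Without inplace=True, fillna() returns a new DataFrame and leaves df unchanged
filled = df.fillna(0)
before = int(df.isna().sum().sum())

# With inplace=True, df itself is modified
df.fillna(0, inplace=True)
after = int(df.isna().sum().sum())

print(before, after)

# Returns:
# 3 0
```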

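How to Replace NaN Values with Zeroes in Pandas Using .replace() and np.nan For a Column

Because Pandas is closely tied to NumPy, missing values in numeric columns are represented by NumPy’s np.nan. This means we can also target them with the .replace() method. Note that .replace() is a Pandas method, not a NumPy one; NumPy only supplies the np.nan value we pass to it. We can apply the .replace() method directly to a Pandas Series (that is, a column). For this use, the .replace() method takes two arguments: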

  1. The value to replace
  2. The value to replace with

Let’s see what this looks like:

# Replace NaN Values with Zeroes for a Single Pandas Column with .replace()
import pandas as pd
import numpy as np

df = pd.DataFrame({'Col_A': [1, 2, 3, np.nan], 'Col_B': [1, np.nan, 3, 4], 'Col_C': [1, 2, np.nan, 4]})

df['Col_A'] = df['Col_A'].replace(np.nan, 0)
print(df)

# Returns:
#    Col_A  Col_B  Col_C
# 0    1.0    1.0    1.0
# 1    2.0    NaN    2.0
# 2    3.0    3.0    NaN
# 3    0.0    4.0    4.0

In the code above, we use the Pandas .replace() method to replace every NaN value in 'Col_A' with the value 0.

How to Replace NaN Values with Zeroes in Pandas Using .replace() and np.nan For a DataFrame

Similarly, we can use the Pandas .replace() method to replace NaN values with zeroes across an entire Pandas DataFrame. To do this, we call .replace() on the DataFrame itself, as shown below:

# Replace NaN Values with Zeroes for a DataFrame with .replace()
import pandas as pd
import numpy as np

df = pd.DataFrame({'Col_A': [1, 2, 3, np.nan], 'Col_B': [1, np.nan, 3, 4], 'Col_C': [1, 2, np.nan, 4]})

df = df.replace(np.nan, 0)

print(df)

# Returns:
#    Col_A  Col_B  Col_C
# 0    1.0    1.0    1.0
# 1    2.0    0.0    2.0
# 2    3.0    3.0    0.0
# 3    0.0    4.0    4.0

In the code above, we apply the .replace() method to the entire DataFrame to replace every missing value with 0.
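For this sample data, the .fillna() and .replace() approaches should give identical results. The sketch below checks that with DataFrame.equals(); the variable names are illustrative and not part of the original tutorial.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col_A': [1, 2, 3, np.nan], 'Col_B': [1, np.nan, 3, 4], 'Col_C': [1, 2, np.nan, 4]})

# Build one result with each method, leaving df itself untouched
filled = df.fillna(0)
replaced = df.replace(np.nan, 0)

# equals() compares both the values and the column dtypes
same = filled.equals(replaced)
print(same)

# Returns:
# True
```

Since the outcome is the same here, .fillna() is usually the more readable choice when the goal is specifically to fill missing values.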

Conclusion

In this tutorial, you learned how to use Pandas to replace NaN values with zeroes. You learned how to do this for a single column, multiple columns, and an entire DataFrame using the Pandas .fillna() method. Then, you learned how to use the Pandas .replace() method with NumPy’s np.nan to do the same for a single column and an entire DataFrame.


Nik Piepenbreier

Nik is the author of datagy.io and has over a decade of experience working with data analytics, data science, and Python. He specializes in teaching developers how to use Python for data science using hands-on tutorials.
