Pandas Sum: Add Dataframe Columns and Rows

Pandas Sum Add Values in Dataframe - Cover Image

In this tutorial, you’ll learn how use Pandas to calculate a sum, including how to add up dataframe columns and rows. Being able to add up values to calculate either column totals or row totals allows you to generate some helpful summary statistics.

By the end of this tutorial, you’ll have learned how to:

  • Calculate the Sum of a Pandas Dataframe Column
  • Calculate the Sum of a Pandas Dataframe Row
  • Add Pandas Dataframe Columns Together
  • Add Pandas Dataframe Columns That Meet a Condition
  • Calculate the Sum of a Pandas GroupBy Dataframe

Quick Answer: Use Pandas .sum() To Add Dataframe Columns and Rows

How to add valuesDescriptionExample
Column-wiseAdd all numeric values in a Pandas column or a dataframe’s columnsdf['column name'].sum()
Row-wiseAdd all numeric values in a Pandas rowdf.sum(axis=1)
Specific ColumnsAdd values of specific columnsdf['column 1'] + df['column 2']
How to use the Pandas sum method to add values in different ways

Loading a Sample Pandas Dataframe

If you want to follow along with the tutorial line by line, copy the code below. This code loads a sample Pandas Dataframe that we’ll reference throughout the tutorial. If you have your own data, feel free to use that follow along, but your results will vary.

# Loading a Sample Pandas Dataframe
import pandas as pd

df = pd.DataFrame.from_dict({
    'Name': ['Nik', 'Kate', 'Kevin', 'Evan', 'Jane', 'Kyra', 'Melissa'],
    'January_Sales': [90, 95, 75, 93, 60, 85, 75],
    'February_Sales': [95, 95, 75, 65, 50, 85, 100],
    'March_Sales': [100, 95, 50, 75, 90, 50, 80]
})

print(df.head())

# Returns:
#     Name  January_Sales  February_Sales  March_Sales
# 0    Nik             90              95          100
# 1   Kate             95              95           95
# 2  Kevin             75              75           50
# 3   Evan             93              65           75
# 4   Jane             60              50           90

We can see that we have four columns: 1 that contains the name of a salesperson and three that contain the sales values of each salesperson.

In the next section, you’ll learn how to use Pandas to add up all the values in a dataframe column.

Calculate the Sum of a Pandas Dataframe Column

A common task you may need to do is add up all the values in a Pandas Dataframe column. Thankfully, Pandas makes this very easy with the sum method. We can apply this method to either a Pandas series (meaning, a column) or an entire dataframe.

Let’s start by learning how to how to add up all the values in a Pandas column:

# Add all values in a Pandas column
january_sum = df['January_Sales'].sum()
print(january_sum)

# Returns: 572

Similarly, we can calculate the sum of all columns in a Pandas Dataframe. We can do this by simply applying the sum method on the entire dataframe.

Let’s give this a shot:

# Calculate the sum of all columns in a Pandas Dataframe
dataframe_sum = df.sum()
print(dataframe_sum)

# Returns: 
# Name              NikKateKevinEvanJaneKyraMelissa
# January_Sales                                 573
# February_Sales                                565
# March_Sales                                   540

This returns a Pandas series that easily query, if we wanted to return the sum of a particular column. By default, Pandas will only add up numeric columns, meaning that we don’t add up our Name column.

In the next section, you’ll learn how to calculate the sum of a Pandas Dataframe row.

Calculate the Sum of a Pandas Dataframe Row

In many cases, you’ll want to add up values across rows in a Pandas Dataframe. Similar to the example above, we can make use of the .sum method. By default, Pandas will apply an axis=0 argument, which will add up values index-wise. If we can change this to axis=1, values will be added column-wise.

Let’s see how we can add up values across rows in Pandas:

# Adding up dataframe rows using .sum()
dataframe_sum = df.sum(axis=1, numeric_only=True)
print(dataframe_sum)

# Returns: 
# 0    285
# 1    285
# 2    200
# 3    233
# 4    200
# 5    220
# 6    255
# dtype: int64

You may be wondering why we apply the numeric_only=True argument here. In future versions of Pandas, a TypeError will be thrown if non-numeric columns are included.

What if we wanted to assign an index to make the rows easier to tell apart? We can do this using the Pandas set_index method. Let’s see what this looks like:

# Setting an index and adding up all dataframe rows
dataframe_sum = df.set_index('Name').sum(axis=1, numeric_only=True)
print(dataframe_sum)

# Returns: 
# Name
# Nik        285
# Kate       285
# Kevin      200
# Evan       233
# Jane       200
# Kyra       220
# Melissa    255
# dtype: int64

This is much cleaner result that allows us to better see the row’s identifier, which in this case is the name of the salesperson.

In the next section, you’ll learn how to just add some columns of a Pandas Dataframe together.

Add Pandas Dataframe Columns Together

Pandas makes it easy to add different columns together, selectively. Say we only wanted to add two columns together row-wise, rather than all of them, we can simply add the columns directly. The benefit of this approach is that we can assign a new column that stores these values.

Let’s see what this looks like:

# Add two columns in a Pandas dataframe
df['Jan_Feb_Sum'] = df['January_Sales'] + df['February_Sales']
print(df.head())

# Returns: 
#     Name  January_Sales  February_Sales  March_Sales  Jan_Feb_Sum
# 0    Nik             90              95          100          185
# 1   Kate             95              95           95          190
# 2  Kevin             75              75           50          150
# 3   Evan             93              65           75          158
# 4   Jane             60              50           90          110

We can see that the we’ve created a new column that stores the sum of two of our columns. A great thing about this operation is that its vectorized, meaning that its very fast and takes advantage of the power of Pandas.

In the next section, you’ll learn how to add dataframe columns conditionally.

Add Pandas Dataframe Columns That Meet a Condition

There may be times when you want to add multiple columns in a dataframe, but not all of them. We can do this by adding Pandas columns conditionally, with the help of a list comprehension.

For this example, let’s modify our dataframe to include an additional numerical column:

# Loading a modify Pandas Dataframe
# Loading a Sample Pandas Dataframe
import pandas as pd

df = pd.DataFrame.from_dict({
    'Name': ['Nik', 'Kate', 'Kevin', 'Evan', 'Jane', 'Kyra', 'Melissa'],
    'January_Sales': [90, 95, 75, 93, 60, 85, 75],
    'February_Sales': [95, 95, 75, 65, 50, 85, 100],
    'March_Sales': [100, 95, 50, 75, 90, 50, 80],
    'Some Random Number': [1,2,3,4,5,6,7]
})

print(df.head())

# Returns: 
#     Name  January_Sales  February_Sales  March_Sales  Some Random Number
# 0    Nik             90              95          100                   1
# 1   Kate             95              95           95                   2
# 2  Kevin             75              75           50                   3
# 3   Evan             93              65           75                   4
# 4   Jane             60              50           90                   5

Now, when we’re adding the values of our rows, it may not make sense to include the last column. Say, we only wanted to include the columns that include the word Sales. What we can do is create a list comprehension that checks if the word Sales exists in a column or not.

Let’s see how we can do this:

sales_columns = [col for col in df.columns if 'Sales' in col]
print(sales_columns)

# Returns: ['January_Sales', 'February_Sales', 'March_Sales']

You can learn more about how to do this in this tutorial, by learning how to iterate over columns and checking for a condition. In order to do this, we’ll first use Pandas to get our dataframe’s columns as a list.

Now that we have our columns selected, we can use the axis=1 argument and add up only the columns that contain Sales. Let’s see what this looks like:

# Adding columns conditionally in Pandas
sales_columns = [col for col in df.columns if 'Sales' in col]

df['Total Sales'] = df[sales_columns].sum(axis=1)
print(df.head())

# Returns: 
#     Name  January_Sales  February_Sales  March_Sales  Some Random Number  Total Sales
# 0    Nik             90              95          100                   1          285
# 1   Kate             95              95           95                   2          285
# 2  Kevin             75              75           50                   3          200
# 3   Evan             93              65           75                   4          233
# 4   Jane             60              50           90                   5          200

This way, we can safely add up values across columns row-wise without adding in columns that we don’t want to include.

In the next section, you’ll learn how to calculate the sum of a Pandas Dataframe when data are grouped using groupby.

Calculate the Sum of a Pandas GroupBy Dataframe

In this final section, you’ll learn how to calculate the sum of a Pandas Dataframe when grouping data using the groupby method. For this, we’ll modify our dataframe to include a column to a salesperson’s gender. This can allow us to group the data by gender and calculating totals by gender.

# Loading a Sample Pandas Dataframe
import pandas as pd

df = pd.DataFrame.from_dict({
    'Name': ['Nik', 'Kate', 'Kevin', 'Evan', 'Jane', 'Kyra', 'Melissa'],
    'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Female', 'Female'],
    'January_Sales': [90, 95, 75, 93, 60, 85, 75],
    'February_Sales': [95, 95, 75, 65, 50, 85, 100],
    'March_Sales': [100, 95, 50, 75, 90, 50, 80],
})

print(df.head())

# Returns: 
#     Name  Gender  January_Sales  February_Sales  March_Sales
# 0    Nik    Male             90              95          100
# 1   Kate  Female             95              95           95
# 2  Kevin    Male             75              75           50
# 3   Evan    Male             93              65           75
# 4   Jane  Female             60              50           90

We can now group our data using the group by method, in order to group it by gender. To learn more about how to group data with the groupby method, check out my video here:

How to use the Pandas groupby method

Let’s group our data and add all the numeric columns:

grouped = df.groupby('Gender').sum()
print(grouped)

# Returns: 
#         January_Sales  February_Sales  March_Sales
# Gender                                            
# Female            315             330          315
# Male              258             235          225

We can see that first grouping our data by Gender and then adding the values in the dataframe returns a column-wise sum based on the groupings of Gender.

Conclusion

In this tutorial, you learned how to use the Pandas sum method to calculate sums across dataframes. You learned how to add values row-wise and column-wise. You also learned how add columns conditionally and how to add values in a grouped Pandas Dataframe.

To learn more about the Pandas sum function, check out the official documentation here.

Additional Resources

Check out the links below to learn about related topics: