# Pandas Sum: Add Dataframe Columns and Rows In this tutorial, you’ll learn how use Pandas to calculate a sum, including how to add up dataframe columns and rows. Being able to add up values to calculate either column totals or row totals allows you to generate some helpful summary statistics.

By the end of this tutorial, you’ll have learned how to:

• Calculate the Sum of a Pandas Dataframe Column
• Calculate the Sum of a Pandas Dataframe Row
• Add Pandas Dataframe Columns Together
• Add Pandas Dataframe Columns That Meet a Condition
• Calculate the Sum of a Pandas GroupBy Dataframe

If you want to follow along with the tutorial line by line, copy the code below. This code loads a sample Pandas Dataframe that we’ll reference throughout the tutorial. If you have your own data, feel free to use that follow along, but your results will vary.

``````# Loading a Sample Pandas Dataframe
import pandas as pd

df = pd.DataFrame.from_dict({
'Name': ['Nik', 'Kate', 'Kevin', 'Evan', 'Jane', 'Kyra', 'Melissa'],
'January_Sales': [90, 95, 75, 93, 60, 85, 75],
'February_Sales': [95, 95, 75, 65, 50, 85, 100],
'March_Sales': [100, 95, 50, 75, 90, 50, 80]
})

# Returns:
#     Name  January_Sales  February_Sales  March_Sales
# 0    Nik             90              95          100
# 1   Kate             95              95           95
# 2  Kevin             75              75           50
# 3   Evan             93              65           75
# 4   Jane             60              50           90``````

We can see that we have four columns: 1 that contains the name of a salesperson and three that contain the sales values of each salesperson.

In the next section, you’ll learn how to use Pandas to add up all the values in a dataframe column.

## Calculate the Sum of a Pandas Dataframe Column

A common task you may need to do is add up all the values in a Pandas Dataframe column. Thankfully, Pandas makes this very easy with the `sum` method. We can apply this method to either a Pandas series (meaning, a column) or an entire dataframe.

Let’s start by learning how to how to add up all the values in a Pandas column:

``````# Add all values in a Pandas column
january_sum = df['January_Sales'].sum()
print(january_sum)

# Returns: 572``````

Similarly, we can calculate the sum of all columns in a Pandas Dataframe. We can do this by simply applying the `sum` method on the entire dataframe.

Let’s give this a shot:

``````# Calculate the sum of all columns in a Pandas Dataframe
dataframe_sum = df.sum()
print(dataframe_sum)

# Returns:
# Name              NikKateKevinEvanJaneKyraMelissa
# January_Sales                                 573
# February_Sales                                565
# March_Sales                                   540``````

This returns a Pandas series that easily query, if we wanted to return the sum of a particular column. By default, Pandas will only add up numeric columns, meaning that we don’t add up our Name column.

In the next section, you’ll learn how to calculate the sum of a Pandas Dataframe row.

## Calculate the Sum of a Pandas Dataframe Row

In many cases, you’ll want to add up values across rows in a Pandas Dataframe. Similar to the example above, we can make use of the `.sum` method. By default, Pandas will apply an `axis=0` argument, which will add up values index-wise. If we can change this to `axis=1`, values will be added column-wise.

Let’s see how we can add up values across rows in Pandas:

``````# Adding up dataframe rows using .sum()
dataframe_sum = df.sum(axis=1, numeric_only=True)
print(dataframe_sum)

# Returns:
# 0    285
# 1    285
# 2    200
# 3    233
# 4    200
# 5    220
# 6    255
# dtype: int64``````

You may be wondering why we apply the `numeric_only=True` argument here. In future versions of Pandas, a `TypeError` will be thrown if non-numeric columns are included.

What if we wanted to assign an index to make the rows easier to tell apart? We can do this using the Pandas `set_index` method. Let’s see what this looks like:

``````# Setting an index and adding up all dataframe rows
dataframe_sum = df.set_index('Name').sum(axis=1, numeric_only=True)
print(dataframe_sum)

# Returns:
# Name
# Nik        285
# Kate       285
# Kevin      200
# Evan       233
# Jane       200
# Kyra       220
# Melissa    255
# dtype: int64``````

This is much cleaner result that allows us to better see the row’s identifier, which in this case is the name of the salesperson.

In the next section, you’ll learn how to just add some columns of a Pandas Dataframe together.

## Add Pandas Dataframe Columns Together

Pandas makes it easy to add different columns together, selectively. Say we only wanted to add two columns together row-wise, rather than all of them, we can simply add the columns directly. The benefit of this approach is that we can assign a new column that stores these values.

Let’s see what this looks like:

``````# Add two columns in a Pandas dataframe
df['Jan_Feb_Sum'] = df['January_Sales'] + df['February_Sales']

# Returns:
#     Name  January_Sales  February_Sales  March_Sales  Jan_Feb_Sum
# 0    Nik             90              95          100          185
# 1   Kate             95              95           95          190
# 2  Kevin             75              75           50          150
# 3   Evan             93              65           75          158
# 4   Jane             60              50           90          110``````

We can see that the we’ve created a new column that stores the sum of two of our columns. A great thing about this operation is that its vectorized, meaning that its very fast and takes advantage of the power of Pandas.

In the next section, you’ll learn how to add dataframe columns conditionally.

## Add Pandas Dataframe Columns That Meet a Condition

There may be times when you want to add multiple columns in a dataframe, but not all of them. We can do this by adding Pandas columns conditionally, with the help of a list comprehension.

For this example, let’s modify our dataframe to include an additional numerical column:

``````# Loading a modify Pandas Dataframe
import pandas as pd

df = pd.DataFrame.from_dict({
'Name': ['Nik', 'Kate', 'Kevin', 'Evan', 'Jane', 'Kyra', 'Melissa'],
'January_Sales': [90, 95, 75, 93, 60, 85, 75],
'February_Sales': [95, 95, 75, 65, 50, 85, 100],
'March_Sales': [100, 95, 50, 75, 90, 50, 80],
'Some Random Number': [1,2,3,4,5,6,7]
})

# Returns:
#     Name  January_Sales  February_Sales  March_Sales  Some Random Number
# 0    Nik             90              95          100                   1
# 1   Kate             95              95           95                   2
# 2  Kevin             75              75           50                   3
# 3   Evan             93              65           75                   4
# 4   Jane             60              50           90                   5``````

Now, when we’re adding the values of our rows, it may not make sense to include the last column. Say, we only wanted to include the columns that include the word Sales. What we can do is create a list comprehension that checks if the word Sales exists in a column or not.

Let’s see how we can do this:

``````sales_columns = [col for col in df.columns if 'Sales' in col]
print(sales_columns)

# Returns: ['January_Sales', 'February_Sales', 'March_Sales']``````

You can learn more about how to do this in this tutorial, by learning how to iterate over columns and checking for a condition. In order to do this, we’ll first use Pandas to get our DataFrame’s columns as a list.

Now that we have our columns selected, we can use the `axis=1` argument and add up only the columns that contain Sales. Let’s see what this looks like:

``````# Adding columns conditionally in Pandas
sales_columns = [col for col in df.columns if 'Sales' in col]

df['Total Sales'] = df[sales_columns].sum(axis=1)

# Returns:
#     Name  January_Sales  February_Sales  March_Sales  Some Random Number  Total Sales
# 0    Nik             90              95          100                   1          285
# 1   Kate             95              95           95                   2          285
# 2  Kevin             75              75           50                   3          200
# 3   Evan             93              65           75                   4          233
# 4   Jane             60              50           90                   5          200``````

This way, we can safely add up values across columns row-wise without adding in columns that we don’t want to include.

In the next section, you’ll learn how to calculate the sum of a Pandas Dataframe when data are grouped using `groupby`.

## Calculate the Sum of a Pandas GroupBy Dataframe

In this final section, you’ll learn how to calculate the sum of a Pandas Dataframe when grouping data using the `groupby` method. For this, we’ll modify our dataframe to include a column to a salesperson’s gender. This can allow us to group the data by gender and calculating totals by gender.

``````# Loading a Sample Pandas Dataframe
import pandas as pd

df = pd.DataFrame.from_dict({
'Name': ['Nik', 'Kate', 'Kevin', 'Evan', 'Jane', 'Kyra', 'Melissa'],
'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Female', 'Female'],
'January_Sales': [90, 95, 75, 93, 60, 85, 75],
'February_Sales': [95, 95, 75, 65, 50, 85, 100],
'March_Sales': [100, 95, 50, 75, 90, 50, 80],
})

# Returns:
#     Name  Gender  January_Sales  February_Sales  March_Sales
# 0    Nik    Male             90              95          100
# 1   Kate  Female             95              95           95
# 2  Kevin    Male             75              75           50
# 3   Evan    Male             93              65           75
# 4   Jane  Female             60              50           90``````

We can now group our data using the group by method, in order to group it by gender. To learn more about how to group data with the `groupby` method, check out my video here:

Let’s group our data and add all the numeric columns:

``````grouped = df.groupby('Gender').sum()
print(grouped)

# Returns:
#         January_Sales  February_Sales  March_Sales
# Gender
# Female            315             330          315
# Male              258             235          225``````

We can see that first grouping our data by `Gender` and then adding the values in the dataframe returns a column-wise sum based on the groupings of Gender.

## Conclusion

In this tutorial, you learned how to use the Pandas sum method to calculate sums across dataframes. You learned how to add values row-wise and column-wise. You also learned how add columns conditionally and how to add values in a grouped Pandas Dataframe.

To learn more about the Pandas `sum` function, check out the official documentation here.