In this tutorial, you’ll learn how to use Pandas to count the number of columns in a Pandas Dataframe. You’ll learn a number of different ways to accomplish this, including using the df.columns attribute and the df.shape attribute. Knowing how to get the number of columns in a Pandas Dataframe is an important skill. Because these approaches do different things and return different types of values, knowing which one returns the type of result you want helps ensure your program behaves the way you expect.
The Quick Answer: Use len(df.columns)
Loading a Sample Pandas Dataframe
Let’s start this tutorial off by loading a sample dataframe that you can follow along with. If you’re working with your own dataframe, you’ll likely encounter different results.
Let’s load our Pandas Dataframe:
# Loading a Sample Pandas Dataframe
from seaborn import load_dataset
df = load_dataset('penguins')
print(df.head())
# Returns
# species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex
# 0 Adelie Torgersen 39.1 18.7 181.0 3750.0 Male
# 1 Adelie Torgersen 39.5 17.4 186.0 3800.0 Female
# 2 Adelie Torgersen 40.3 18.0 195.0 3250.0 Female
# 3 Adelie Torgersen NaN NaN NaN NaN NaN
# 4 Adelie Torgersen 36.7 19.3 193.0 3450.0 Female
In the code above, we loaded a Pandas Dataframe using the load_dataset() function included in the Seaborn package. If you don’t have the Seaborn module installed, you can install it using the following command:
pip install seaborn
To learn more about the Seaborn library, check out my in-depth tutorial here, which guides you through using the popular data visualization library.
Now that we have a dataframe, let’s get started in learning how to count the number of columns in a Pandas Dataframe using the .columns attribute.
Get Number of Pandas Dataframe Columns with .columns
In this section, you’ll learn a simple and versatile approach to counting the number of columns in a Pandas Dataframe. Because Pandas Dataframes are Python objects, we can access different attributes belonging to them. One of these attributes is the .columns attribute, which returns a list-like structure (a Pandas Index) containing the labels of all the columns in that dataframe.
Because we can count the number of items in a list-like structure using the len() function, we can pass the attribute into the function to count the number of columns in our dataframe.
Let’s see how this works:
# Count Columns in a Pandas Dataframe Using .columns
from seaborn import load_dataset
df = load_dataset('penguins')
num_columns = len(df.columns)
print(num_columns)
# Returns: 7
We can see from the example above that passing the .columns attribute into the len() function returns the number of columns in the dataframe.
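To make the idea concrete without downloading a dataset, here is a minimal sketch that uses a small hand-built dataframe. The column names and values are made up for illustration. It shows that .columns is a Pandas Index rather than a plain Python list, and that len() works on it all the same.

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({
    'name': ['Ada', 'Grace', 'Linus'],
    'age': [36, 45, 28],
    'city': ['London', 'New York', 'Helsinki'],
})

columns = df.columns

# .columns is an Index object, not a list
print(type(columns).__name__)   # Index

# It can be converted to a list if needed
print(list(columns))            # ['name', 'age', 'city']

# len() counts the column labels
num_columns = len(columns)
print(num_columns)              # 3
```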
In the next section, you’ll learn how to count the number of Pandas Dataframe columns matching a certain condition.
Count Number of Pandas Dataframe Columns Matching a Condition
There may be times when you want to count the number of columns in a Pandas Dataframe matching a condition. For example, you may want to know how many columns contain a given suffix or contain numbers.
We can do this using the .columns dataframe attribute. We can use a list comprehension to filter the items to only provide items that match a condition. To learn more about Python list comprehensions, check out my in-depth tutorial here, which also includes a video overview.
We can loop over the list-like structure returned and keep only items that match a condition. In the example below, you’ll learn to count the number of columns whose names contain the string '_mm'. In this dataset, every such column happens to end with '_mm', so the check also works as a suffix test here.
# Count Columns in a Pandas Dataframe Using .columns Conditionally
from seaborn import load_dataset
df = load_dataset('penguins')
conditional_columns = [col for col in df.columns if '_mm' in col]
num_columns = len(conditional_columns)
print(num_columns)
# Returns: 3
We can see here that we check whether the string '_mm' exists in each column name. If it doesn’t, then we don’t include that column in our list. Finally, we count the number of columns that meet our condition, which returns the value 3.
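The check above matches '_mm' anywhere in a column name. If you need a strict suffix test, or want to count numeric columns, the sketch below shows one possible approach. It uses a small hand-built dataframe (a few values borrowed from the penguins preview above, chosen only for illustration), the string method endswith(), the vectorized .str accessor on the column Index, and the select_dtypes() dataframe method.

```python
import pandas as pd

# Small illustrative dataframe mirroring a few penguin columns
df = pd.DataFrame({
    'species': ['Adelie', 'Adelie', 'Adelie'],
    'bill_length_mm': [39.1, 39.5, 40.3],
    'bill_depth_mm': [18.7, 17.4, 18.0],
    'body_mass_g': [3750.0, 3800.0, 3250.0],
})

# Strict suffix test with a generator expression
suffix_count = sum(col.endswith('_mm') for col in df.columns)

# Equivalent vectorized test using the Index .str accessor
suffix_count_vectorized = int(df.columns.str.endswith('_mm').sum())

# Count columns that hold numeric data
numeric_count = len(df.select_dtypes(include='number').columns)

print(suffix_count)             # 2
print(suffix_count_vectorized)  # 2
print(numeric_count)            # 3
```

The list comprehension is easier to extend with arbitrary Python logic, while the .str accessor keeps the condition in a single expression.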
In the next section, you’ll learn how to use the Pandas .shape attribute to count the number of columns in a Pandas Dataframe.
Get Number of Pandas Dataframe Columns with .shape
The Pandas .shape attribute is a helpful dataframe attribute that allows us to see the number of rows and columns in a Pandas Dataframe. The .shape attribute returns a tuple of values, where the first value is the number of rows in the dataframe and the second value is the number of columns.
We can get the number of columns in our dataframe by using the .shape attribute and accessing the second item.
Before we do this, let’s see what the .shape attribute returns:
# Count Columns in a Pandas Dataframe Using .shape
from seaborn import load_dataset
df = load_dataset('penguins')
print(df.shape)
# Returns: (344, 7)
We can see that the attribute returns a tuple in the form (number of rows, number of columns). We can access the second item by accessing index position 1.
Let’s see how we can do this in Pandas:
# Count Columns in a Pandas Dataframe Using .shape
from seaborn import load_dataset
df = load_dataset('penguins')
num_columns = df.shape[1]
print(num_columns)
# Returns: 7
We can see here that when we access the second item of the tuple that’s returned (index position 1), we get the number of columns in a Pandas Dataframe. In the next section, you’ll learn how to use the Pandas .info() method to get the number of columns in a dataframe.
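Because .shape returns an ordinary tuple, it can also be unpacked into two variables. The sketch below uses a small made-up dataframe to show this, and also shows that calling len() on the dataframe itself counts rows, not columns, which is an easy mistake to make.

```python
import pandas as pd

# Hypothetical data: 4 rows and 3 columns
df = pd.DataFrame({
    'a': [1, 2, 3, 4],
    'b': [5, 6, 7, 8],
    'c': [9, 10, 11, 12],
})

# Unpack the (rows, columns) tuple in one step
num_rows, num_columns = df.shape
print(num_rows)     # 4
print(num_columns)  # 3

# len() on the dataframe counts rows, not columns
print(len(df))      # 4

# Both column-counting approaches agree
print(df.shape[1] == len(df.columns))  # True
```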
Get Number of Pandas Dataframe Columns with .info()
In this final section, you’ll learn how to use the .info() dataframe method to get the number of columns in a dataframe. This method works a little differently from the attributes you’ve seen in the previous sections of the tutorial. Rather than returning a value, it prints a text summary of the dataframe and returns None.
Let’s see what this looks like in Pandas:
# Count Columns in a Pandas Dataframe Using .info()
from seaborn import load_dataset
df = load_dataset('penguins')
# .info() prints its summary directly, so no print() call is needed
df.info()
# Prints:
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 344 entries, 0 to 343
# Data columns (total 7 columns):
# # Column Non-Null Count Dtype
# --- ------ -------------- -----
# 0 species 344 non-null object
# 1 island 344 non-null object
# 2 bill_length_mm 342 non-null float64
# 3 bill_depth_mm 342 non-null float64
# 4 flipper_length_mm 342 non-null float64
# 5 body_mass_g 342 non-null float64
# 6 sex 333 non-null object
# dtypes: float64(4), object(3)
# memory usage: 18.9+ KB
We can see here that while we get a lot of helpful information about the dataframe itself, including the number of columns, the information isn’t easily accessible programmatically. For example, we can’t access the number of columns without actually reading the printed output ourselves.
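If you do need the summary as text, one option is to pass a writable buffer to the method’s buf parameter. The sketch below, which uses a small made-up dataframe, confirms that .info() returns None and that the summary ends up in the buffer as a string. Parsing that text to extract the column count would be fragile, since the wording may differ between Pandas versions, so len(df.columns) or df.shape[1] remain the more reliable choices.

```python
import io

import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({
    'a': [1, 2, 3],
    'b': [4.0, 5.0, 6.0],
    'c': ['x', 'y', 'z'],
})

# Write the summary into an in-memory text buffer instead of stdout
buffer = io.StringIO()
result = df.info(buf=buffer)
summary = buffer.getvalue()

# The method itself returns nothing
print(result)                 # None

# The summary is plain text intended for people to read
print(type(summary).__name__)  # str

# For a value you can use in code, prefer the attributes
print(len(df.columns))        # 3
```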
Conclusion
In this tutorial, you learned how to use Python and Pandas to count the number of columns in a dataframe. You learned how to do this using the .columns attribute, the .shape attribute, and the .info() method. You learned how these approaches work and the types of results they return. You also learned how to count the number of columns meeting a condition, such as containing a substring.
To learn more about the Pandas .columns attribute, check out the official documentation here.