Get Pandas Column Names as a List

In this post, you’ll learn how to get Pandas column names as a list. You’ll learn several methods, including how to return a list of column names and a list that’s sorted alphabetically. You’ll learn which of the methods is fastest, even if it’s not the fastest to write. Finally, you’ll learn how to check if a column exists in a dataframe.

Loading a Sample Dataframe

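To follow along with this tutorial, load the sample dataframe provided below by copying this code into your favourite text editor! We use the Seaborn load_dataset() function to load one of Seaborn’s example datasets. The dataset is downloaded the first time you request it, so you’ll need an internet connection.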

If you don’t yet have Seaborn installed, you can install it using pip install seaborn in your terminal. To learn more about Seaborn, check out my tutorial series here.

import pandas as pd
from seaborn import load_dataset

df = load_dataset('penguins')

print(df.head())

This prints the first five rows of the dataframe:

  species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
0  Adelie  Torgersen            39.1           18.7              181.0       3750.0    Male
1  Adelie  Torgersen            39.5           17.4              186.0       3800.0  Female
2  Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female
3  Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN
4  Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female

Get Pandas Column Names

Pandas provides a very helpful attribute, the .columns attribute, to access column names. By default, this returns an object of type Index. An Index can be iterated over, but it isn’t a Python list.

Before we dive any further, let’s take a look at what the .columns attribute returns:

print(df.columns)

This returns:

Index(['species', 'island', 'bill_length_mm', 'bill_depth_mm',
       'flipper_length_mm', 'body_mass_g', 'sex'],
      dtype='object')
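To get a feel for what kind of object this is, here is a minimal sketch. It assumes a small stand-in dataframe (the name sample and its values are made up for illustration, so nothing needs to be downloaded) and shows that the Index can be looped over and accessed by position even though it is not a list.

```python
import pandas as pd

# Hypothetical stand-in for the penguins dataframe (illustrative values only)
sample = pd.DataFrame({
    'species': ['Adelie', 'Gentoo'],
    'island': ['Torgersen', 'Biscoe'],
    'body_mass_g': [3750.0, 5000.0],
})

cols = sample.columns

# .columns is a pandas Index, not a Python list
print(isinstance(cols, pd.Index))   # True
print(isinstance(cols, list))       # False

# The Index still supports iteration, len() and positional access
for name in cols:
    print(name)

print(len(cols))                    # 3
print(cols[0])                      # species
```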

Get Pandas Column Names as a List

To get a list of Pandas column names, we can simply turn our Index object into a list by using Python’s list() function.

Let’s take a look at how this works:

>>> print(list(df.columns))
['species', 'island', 'bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g', 'sex']

Similarly, you could use the Pandas .tolist() method, which works as below:

>>> print(df.columns.tolist())
['species', 'island', 'bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g', 'sex']

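Alternatively, you could pass the dataframe itself to list(). This works because iterating over a dataframe yields its column labels. It is typically a less efficient method, though the difference is really only noticeable for dataframes with a very large number of columns: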

>>> print(list(df))
['species', 'island', 'bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g', 'sex']
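If you’d like to compare the three approaches yourself, the sketch below is one way to do it. It assumes a made-up wide dataframe (wide, with 1,000 hypothetical columns) rather than the penguins data, checks that all three approaches return the same list, and times each with Python’s timeit module. No timings are shown here, because the results depend on your machine and your Pandas version.

```python
import timeit

import pandas as pd

# Hypothetical wide dataframe: one row, 1,000 columns named col_0 ... col_999
names = [f'col_{i}' for i in range(1000)]
wide = pd.DataFrame([list(range(1000))], columns=names)

# The three approaches from this section
approaches = {
    'list(df.columns)': lambda: list(wide.columns),
    'df.columns.tolist()': lambda: wide.columns.tolist(),
    'list(df)': lambda: list(wide),
}

# All three should produce the same list of names
results = [func() for func in approaches.values()]
print(all(result == results[0] for result in results))

# Time each approach; the numbers depend on your machine and Pandas version
for label, func in approaches.items():
    seconds = timeit.timeit(func, number=1000)
    print(f'{label}: {seconds:.4f} s for 1,000 calls')
```

Treat the output as a rough guide only. For a dataframe with a handful of columns, such as the penguins dataset, any of the three approaches is fine.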

Get Pandas Column Names as an Alphabetically Sorted List

Now that we have a list of dataframe column names, we can sort this list alphabetically. To accomplish this, we can use the Python sorted() function.

By default, Python sorts strings case-sensitively, so names that start with an uppercase letter come before names that start with a lowercase letter. If you don’t want this to happen, you can pass the key=str.lower argument.

Let’s see how this looks:

>>> print(sorted(list(df.columns)))
['bill_depth_mm', 'bill_length_mm', 'body_mass_g', 'flipper_length_mm', 'island', 'sex', 'species']

If the capitalization is causing you issues, simply write the following:

>>> print(sorted(list(df.columns), key=str.lower))
['bill_depth_mm', 'bill_length_mm', 'body_mass_g', 'flipper_length_mm', 'island', 'sex', 'species']
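Because every column name in the penguins dataset is lowercase, both calls above return the same list. To see the difference, here is a small sketch that assumes a made-up dataframe (mixed) with mixed-case column names, which are not part of the penguins data.

```python
import pandas as pd

# Hypothetical dataframe with mixed-case column names (not the penguins data)
mixed = pd.DataFrame(columns=['species', 'Island', 'body_mass_g', 'Sex'])

# Default sort: uppercase letters come before lowercase ones
print(sorted(mixed.columns))
# ['Island', 'Sex', 'body_mass_g', 'species']

# Case-insensitive sort: names are compared by their lowercase form
print(sorted(mixed.columns, key=str.lower))
# ['body_mass_g', 'Island', 'Sex', 'species']
```

Note that key=str.lower only changes how the names are compared. The returned list keeps the original capitalization.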

Check if a Column Exists in a Pandas Dataframe

To see if a column exists in a Pandas dataframe, we can use the Python in operator on the .columns attribute. This returns a boolean: True if the name is one of the column labels, and False otherwise.

Now, let’s see if the column species exists in our dataframe:

>>> print('species' in df.columns)
True

Similarly, to see if the column age exists in our dataframe, we can write:

>>> print('age' in df.columns)
False
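The same idea extends to checking several columns at once. The sketch below is one possible approach, not the only one. It assumes a small stand-in dataframe (sample) and a made-up list of expected column names (required), then reports whether all of them are present and which ones are missing.

```python
import pandas as pd

# Hypothetical stand-in dataframe (illustrative values only)
sample = pd.DataFrame({
    'species': ['Adelie', 'Gentoo'],
    'island': ['Torgersen', 'Biscoe'],
})

# Hypothetical list of columns that later code expects to find
required = ['species', 'island', 'age']

# Are all required columns present?
all_present = set(required).issubset(sample.columns)
print(all_present)    # False

# Which ones are missing? The order of `required` is preserved.
missing = [name for name in required if name not in sample.columns]
print(missing)        # ['age']
```

A check like this can be useful before selecting columns, since asking a dataframe for a column that doesn’t exist raises a KeyError.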

Conclusion

In this post, you learned how to get a list of columns from a Pandas dataframe. You learned different ways to create a list out of this, sort that list alphabetically, and see whether or not a column exists in a given Pandas dataframe.

To learn how to rename Pandas dataframe columns, check out my post here.

To learn about the Pandas .columns attribute, check out the official documentation here.