Use Pandas to Drop Columns and Rows


Working with bigger dataframes, you’ll find yourself wanting to use Pandas to drop columns or rows.

Pandas has a number of different ways to do this. In this post, you’ll learn all you need to know about the drop function.

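In particular, you’ll learn:

  • How to drop a single column or multiple columns
  • How to drop a single row or multiple rows
  • How to drop columns whose names contain a certain string
  • How to drop rows based on one or more conditions
  • How to drop columns by their position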

Putting Together the Dataframe

To get started, let’s put together a sample dataframe that you can use throughout the rest of the tutorial. Each example below starts from this original dataframe, so re-create it before trying the next one. Take a look at the code below:

import pandas as pd

df = pd.DataFrame({'Name': ['Nik', 'Jim', 'Alice', 'Jane', 'Matt', 'Kate'],
                   'Score': [100, 120, 96, 75, 68, 123],
                   'Height': [178, 180, 160, 165, 185, 187],
                   'Weight': [180, 175, 143, 155, 167, 189]})
print(df.head())

By using the df.head() function, you can see what the dataframe’s first five rows look like:

    Name  Score  Height  Weight
0    Nik    100     178     180
1    Jim    120     180     175
2  Alice     96     160     143
3   Jane     75     165     155
4   Matt     68     185     167

How to use the drop function in Pandas

The Pandas drop function is a helpful function to drop columns and rows. Let’s take a quick look at how the function works:

DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')

Let’s look at what the arguments mean:

  • labels: either a label or list of rows or columns to drop
  • axis: defaults to 0 for rows, enter 1 for columns
  • index: either a label or list of rows to drop (by their index)
  • columns: either a label or a list of columns to drop
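  • level: if the dataframe has a MultiIndex, which level to drop labels from
  • inplace: defaults to False, meaning drop returns a new dataframe that you need to assign to a variable; if True, the dataframe is modified directly and nothing is returned
  • errors: defaults to 'raise', meaning a KeyError is raised if a label isn’t found; use 'ignore' to skip missing labels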

Throughout this tutorial, we’ll focus on the axis, index, and columns arguments.
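
To make the inplace and errors arguments more concrete, here is a minimal sketch using a shortened version of the sample dataframe. The 'Age' column name is made up for illustration and doesn’t exist in the data:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Nik', 'Jim', 'Alice'],
                   'Score': [100, 120, 96],
                   'Height': [178, 180, 160]})

# By default, drop returns a new dataframe and leaves the original untouched
dropped = df.drop(columns='Height')
print(list(df.columns))
print(list(dropped.columns))

# With inplace=True, the dataframe is modified directly and drop returns None
df_copy = df.copy()
result = df_copy.drop(columns='Height', inplace=True)
print(result)

# errors='ignore' skips labels that don't exist instead of raising a KeyError
# ('Age' is a hypothetical column that is not in the dataframe)
ignored = df.drop(columns=['Height', 'Age'], errors='ignore')
print(list(ignored.columns))

# With the default errors='raise', a missing label raises a KeyError
raised = False
try:
    df.drop(columns='Age')
except KeyError:
    raised = True
print(raised)
```

Because drop returns None when inplace=True, writing df = df.drop(..., inplace=True) would replace your dataframe with None, so use one style or the other.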

How to drop columns in Pandas

Drop a Single Column in Pandas

There are multiple ways to drop a column in Pandas using the drop function.

If you wanted to drop the Height column, you could write:

df = df.drop('Height', axis=1)
print(df.head())

This prints out:

    Name  Score  Weight
0    Nik    100     180
1    Jim    120     175
2  Alice     96     143
3   Jane     75     155
4   Matt     68     167

Personally, I find the axis argument a little awkward.

You can use the columns argument so that you don’t have to specify an axis at all:

df = df.drop(columns='Height')
print(df.head())

This prints out the exact same dataframe as above:

    Name  Score  Weight
0    Nik    100     180
1    Jim    120     175
2  Alice     96     143
3   Jane     75     155
4   Matt     68     167

Drop Multiple Columns in Pandas

In order to drop multiple columns, follow the same steps as above, but put the names of columns into a list.

If you wanted to drop the Height and Weight columns, you could write either of the snippets below:

df = df.drop(columns=['Height', 'Weight'])
print(df.head())

or write:

df = df.drop(['Height', 'Weight'], axis=1)
print(df.head())

Both of these return:

    Name  Score
0    Nik    100
1    Jim    120
2  Alice     96
3   Jane     75
4   Matt     68

How to drop rows in Pandas

The drop function also makes it easy to drop rows in Pandas.

Drop a Single Row in Pandas

To drop a single row in Pandas, you can pass the row’s index label either as the first argument (the axis defaults to 0, meaning rows) or to the index argument of the drop function.

Let’s try dropping the first row (with index = 0). This can be done by writing either:

df = df.drop(0)
print(df.head())

or write:

df = df.drop(index=0)
print(df.head())

Both of these return the following dataframe:

    Name  Score  Height  Weight
1    Jim    120     180     175
2  Alice     96     160     143
3   Jane     75     165     155
4   Matt     68     185     167
5   Kate    123     187     189

Drop Multiple Rows in Pandas

To drop multiple rows in Pandas, you can pass a list of index labels into the drop function. In this dataframe, the labels happen to match the row numbers.

Let’s drop the first, second, and fourth rows. This can be done by writing:

df = df.drop([0,1,3])
print(df.head())

or you can write:

df = df.drop(index=[0, 1, 3])
print(df.head())

Both of these return the same dataframe:

    Name  Score  Height  Weight
2  Alice     96     160     143
4   Matt     68     185     167
5   Kate    123     187     189
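
One thing worth seeing in code is that drop works on index labels, not row positions, and the surviving rows keep their original labels. The sketch below uses a two-column version of the sample dataframe; the variable names are only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Nik', 'Jim', 'Alice', 'Jane', 'Matt', 'Kate'],
                   'Score': [100, 120, 96, 75, 68, 123]})

# Drop the rows labeled 0, 1 and 3; the remaining rows keep their old labels
remaining = df.drop(index=[0, 1, 3])
print(list(remaining.index))

# To drop by position instead, look up the labels at those positions first.
# Position 0 in `remaining` is the row labeled 2 (Alice).
by_position = remaining.drop(index=remaining.index[[0]])
print(list(by_position['Name']))

# reset_index(drop=True) renumbers the rows from 0, so that labels
# and positions line up again
renumbered = remaining.reset_index(drop=True)
print(list(renumbered.index))
```

If you plan to drop rows several times in a row, resetting the index in between is one way to avoid a KeyError caused by a label that has already been removed.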


How to drop columns if the name contains a certain string in Pandas

You can use the drop function to drop all columns whose names contain a certain string.

For example, in our dataframe, if you wanted to drop the Height and Weight columns, you could check if the string ‘eight’ is in each column name. Try writing the following code:

for column in df.columns:
    if 'eight' in column:
        df.drop(columns=column, inplace=True)

print(df.head())

Let’s take a look at what is happening in this code:

  • The for loop iterates over each column name in df.columns
  • Each iteration checks if ‘eight’ is in the name
  • If it is, the column is dropped
  • Note: we use the inplace argument in order to not have to reassign the dataframe


This code returns the following dataframe:

    Name  Score
0    Nik    100
1    Jim    120
2  Alice     96
3   Jane     75
4   Matt     68
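
The same result can be reached without a loop. The sketch below, on a shortened version of the sample dataframe, shows two alternatives: collecting the matching names in a list comprehension, and building a boolean mask with the string methods on df.columns. Neither is the one correct way; they are simply options that avoid calling drop once per column:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Nik', 'Jim', 'Alice'],
                   'Score': [100, 120, 96],
                   'Height': [178, 180, 160],
                   'Weight': [180, 175, 143]})

# Collect every column name that contains the substring,
# then drop them all in a single call
to_drop = [column for column in df.columns if 'eight' in column]
trimmed = df.drop(columns=to_drop)
print(list(trimmed.columns))

# Alternative: build a boolean mask over the column names and keep
# only the columns where the mask is False.
# regex=False treats 'eight' as a literal string rather than a pattern.
mask = df.columns.str.contains('eight', regex=False)
trimmed_mask = df.loc[:, ~mask]
print(list(trimmed_mask.columns))
```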

How to drop rows if they contain a certain value in Pandas

Pandas makes it easy to drop rows based on a condition.

For example, if you wanted to drop any rows where the weight was less than 160, you could write:

df = df.drop(df[df['Weight'] < 160].index)
print(df)

This returns the following:

   Name  Score  Height  Weight
0   Nik    100     178     180
1   Jim    120     180     175
4  Matt     68     185     167
5  Kate    123     187     189

Let’s explore what’s happening in the code above:

  • df[df['Weight'] < 160].index evaluates to the index labels of the rows where the weight is less than 160
  • This is then passed into the drop function to drop those rows
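
To see the two steps separately, the sketch below first prints the labels that the filter produces and then drops them. It also shows a common alternative, keeping the rows that don’t match the condition, which gives the same result here. The variable names are only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Nik', 'Jim', 'Alice', 'Jane', 'Matt', 'Kate'],
                   'Score': [100, 120, 96, 75, 68, 123],
                   'Height': [178, 180, 160, 165, 185, 187],
                   'Weight': [180, 175, 143, 155, 167, 189]})

# Step 1: find the index labels of the rows where Weight is below 160
labels_to_drop = df[df['Weight'] < 160].index
print(list(labels_to_drop))

# Step 2: pass those labels to drop
dropped = df.drop(labels_to_drop)
print(list(dropped['Name']))

# Alternative: keep only the rows that do NOT match the condition
kept = df[df['Weight'] >= 160]

# Both approaches leave the same rows
print(dropped.equals(kept))
```

Note that the drop version relies on the index labels being unique: if several rows shared a label, dropping that label would remove all of them.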

Dropping Rows on Multiple Conditions

This can also be done for multiple conditions using either | (for or) or & (for and).

If you wanted to drop all records where the Weight was less than 160 or the Height was less than 180, you could write:

df = df.drop(df[(df['Weight'] < 160) | (df['Height'] < 180)].index)
print(df.head())

This would return the following:

   Name  Score  Height  Weight
1   Jim    120     180     175
4  Matt     68     185     167
5  Kate    123     187     189
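
The example above uses | (or). As a complementary sketch, here is the & (and) version, using thresholds picked purely for illustration: a row is dropped only if its Weight is below 160 and its Height is below 165.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Nik', 'Jim', 'Alice', 'Jane', 'Matt', 'Kate'],
                   'Score': [100, 120, 96, 75, 68, 123],
                   'Height': [178, 180, 160, 165, 185, 187],
                   'Weight': [180, 175, 143, 155, 167, 189]})

# Each comparison needs its own parentheses, because & and | bind
# more tightly than < in Python
both = (df['Weight'] < 160) & (df['Height'] < 165)

# Drop only the rows where BOTH conditions are true
dropped_and = df.drop(df[both].index)
print(list(dropped_and['Name']))

# ~ negates a mask, so this keeps the same rows without calling drop
kept = df[~both]
print(list(kept['Name']))
```

Only Alice meets both conditions, so she is the only row removed. Jane’s weight is below 160, but her height of 165 is not below 165, so she stays.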

How to drop a column by column number

To drop columns using the column number, you can use the iloc selector.

For example, if you wanted to drop the columns at positions 1 and 2 (the slice 1:3 stops before position 3), you could write the following code. Here, drop uses the column labels of the selected slice:

df = df.drop(df.iloc[:, 1:3], axis=1)
print(df)

To learn more about the iloc selector (and all the other selectors!), check out this comprehensive guide to 4 Ways to Use Pandas to Select Columns in a Dataframe.

This returns the following dataframe:

    Name  Weight
0    Nik     180
1    Jim     175
2  Alice     143
3   Jane     155
4   Matt     167
5   Kate     189
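
Another way to drop by position, sketched below as an alternative rather than a replacement, is to slice df.columns directly. This hands drop the column labels themselves instead of a slice of the data:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Nik', 'Jim', 'Alice', 'Jane', 'Matt', 'Kate'],
                   'Score': [100, 120, 96, 75, 68, 123],
                   'Height': [178, 180, 160, 165, 185, 187],
                   'Weight': [180, 175, 143, 155, 167, 189]})

# df.columns[1:3] selects the labels at positions 1 and 2;
# position 3 is excluded, as with any Python slice
labels = df.columns[1:3]
print(list(labels))

trimmed = df.drop(columns=labels)
print(list(trimmed.columns))

# Non-adjacent positions can be picked with a list of integers
trimmed_ends = df.drop(columns=df.columns[[0, 3]])
print(list(trimmed_ends.columns))
```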

Conclusion

Thanks for reading all the way to here!

In this tutorial, we learned how to use the drop function in Pandas. Specifically, we learned how to drop single columns/rows, multiple columns/rows, and how to drop columns or rows based on different conditions.

If you still want to dive a little deeper into the drop function, check out the official documentation.
