In this tutorial, you’ll learn how to use Pandas to get the row number (or, really, the index number) of a particular row or rows in a dataframe. There may be many times when you want to know the row number of a particular value, and thankfully Pandas makes this quite easy, using the .index attribute.
Practically speaking, this returns the index labels of the rows, rather than a row number as you may be familiar with in Excel. Because an index can be customized or reordered, it doesn’t always correspond to a row’s position. That being said, Pandas doesn’t provide a true row number, so the index is the closest match to this.
By the end of this tutorial, you’ll have learned:
- How to get the row number(s) for rows matching a condition,
- How to get only a single row number, and
- How to count the number of rows matching a particular condition
The Quick Answer: Use .index to Get a Pandas Row Number
Loading a Sample Pandas Dataframe
To follow along with this tutorial, I have provided a sample Pandas Dataframe. If you want to follow along with the tutorial line by line, feel free to copy the code below. The dataframe is deliberately small so that it is easier to follow along with. Let’s get started!
# Loading a Sample Pandas Dataframe
import pandas as pd
df = pd.DataFrame.from_dict(
    {
        'Name': ['Joan', 'Devi', 'Melissa', 'Dave', 'Nik', 'Kate', 'Evan'],
        'Age': [19, 43, 27, 32, 28, 29, 42],
        'Gender': ['Female', 'Female', 'Female', 'Male', 'Male', 'Female', 'Male'],
        'Education': ['High School', 'College', 'PhD', 'High School', 'College', 'College', 'College'],
        'City': ['Atlanta', 'Toronto', 'New York City', 'Madrid', 'Montreal', 'Vancouver', 'Paris']
    }
)
print(df)
# Returns:
#       Name  Age  Gender    Education           City
# 0     Joan   19  Female  High School        Atlanta
# 1     Devi   43  Female      College        Toronto
# 2  Melissa   27  Female          PhD  New York City
# 3     Dave   32    Male  High School         Madrid
# 4      Nik   28    Male      College       Montreal
# 5     Kate   29  Female      College      Vancouver
# 6     Evan   42    Male      College          Paris
We can see that when we print the dataframe, it has seven rows and five columns. Some columns, such as Name, contain completely unique values, while others, such as Gender and Education, are more categorical.
In the next section, you’ll learn how to get the row numbers that match a condition in a Pandas Dataframe.
Get Row Numbers that Match a Condition in a Pandas Dataframe
In this section, you’ll learn how to use Pandas to get the row number of a row or rows that match a condition in a dataframe.
We can use conditional Pandas filtering to filter our dataframe and then select the index, or indices, of those rows. Let’s see how we can get the row numbers for all rows where the Gender column is Male.
# Get the Row numbers matching a condition in a Pandas dataframe
row_numbers = df[df['Gender'] == 'Male'].index
print(row_numbers)
# Returns:
# Int64Index([3, 4, 6], dtype='int64')
We can see here that this returns three items: the indices for the rows matching the condition.
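The values returned here are index labels, and they match row positions only because the sample dataframe uses the default 0-based index. As a minimal sketch, assuming a small hypothetical dataframe named people that is indexed by name instead, NumPy's flatnonzero can recover the zero-based positions from the same boolean mask:

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe with a non-default index (names instead of 0..n-1)
people = pd.DataFrame(
    {'Gender': ['Female', 'Female', 'Male', 'Male']},
    index=['Joan', 'Devi', 'Dave', 'Nik']
)

mask = people['Gender'] == 'Male'

# Index labels of the matching rows
labels = people.index[mask]
print(list(labels))  # ['Dave', 'Nik']

# Zero-based positions of the matching rows, regardless of the index
positions = np.flatnonzero(mask.to_numpy())
print(positions.tolist())  # [2, 3]
```

With the default index used in this tutorial, both approaches give the same numbers, so the distinction only matters once the index has been changed or the dataframe has been filtered or sorted.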
Now, let’s see how we can return the row numbers for rows matching multiple conditions. To do this, we wrap each condition in parentheses and combine them with the & operator. Let’s select rows that are both Female and from Toronto:
# Get the Row numbers matching multiple conditions in a Pandas dataframe
row_numbers = df[(df['Gender'] == 'Female') & (df['City'] == 'Toronto')].index
print(row_numbers)
# Returns:
# Int64Index([1], dtype='int64')
We can see here that we were able to return the row numbers of the rows in a Pandas Dataframe that match two conditions.
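The same pattern should extend to other boolean operators. The sketch below rebuilds only the two columns it needs from the sample dataframe and uses | (or) and ~ (not). Each comparison keeps its own parentheses because these operators bind more tightly than ==:

```python
import pandas as pd

# Only the columns needed from the sample dataframe in this tutorial
df = pd.DataFrame({
    'Gender': ['Female', 'Female', 'Female', 'Male', 'Male', 'Female', 'Male'],
    'City': ['Atlanta', 'Toronto', 'New York City', 'Madrid', 'Montreal', 'Vancouver', 'Paris'],
})

# Rows that are Male OR located in Toronto
either = df[(df['Gender'] == 'Male') | (df['City'] == 'Toronto')].index
print(list(either))  # [1, 3, 4, 6]

# Rows that are NOT Female
not_female = df[~(df['Gender'] == 'Female')].index
print(list(not_female))  # [3, 4, 6]
```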
In the next section, you’ll learn how to use Pandas to get the first row number that matches a condition.
Get the First Row Number that Matches a Condition in a Pandas Dataframe
There may be times when you want to get only the first row number that matches a particular condition. This could be, for example, if you know that only a single row will match this condition.
We saw above that we returned an Int64Index object (displayed as a plain Index in Pandas 2.0 and later), which is an indexable object. Because of this, we can easily access an individual row number by its position. Let’s see how:
# Get the row number of the first row that matches a condition
row_number = df[df['Name'] == 'Kate'].index[0]
print(row_number)
# Returns: 5
We can see here, that when we index the index object we return just a single row number. This allows us to access and use this index position in different operations. For example, we could then use the row number to modify content within that record or be able to extract it programmatically.
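As a rough illustration of using the row number afterwards, the sketch below updates a value in the matched row with .loc. The new age of 30 is a made-up value for demonstration only, and the length check is there because indexing an empty result with [0] raises an IndexError:

```python
import pandas as pd

# Only the columns needed from the sample dataframe in this tutorial
df = pd.DataFrame({
    'Name': ['Joan', 'Devi', 'Melissa', 'Dave', 'Nik', 'Kate', 'Evan'],
    'Age': [19, 43, 27, 32, 28, 29, 42],
})

matches = df[df['Name'] == 'Kate'].index

# Guard against the case where no row matches the condition
if len(matches) > 0:
    row_number = matches[0]
    print(row_number)  # 5

    # Use the index label with .loc to modify that record
    # (30 is a hypothetical value chosen for this example)
    df.loc[row_number, 'Age'] = 30
    print(df.loc[row_number, 'Age'])  # 30
else:
    print('No matching row found')
```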
In the next section, you’ll learn how to count the number of rows that match a condition.
Count the Number of Rows Matching a Condition
You may also find yourself in a situation where you need to be able to identify how many rows match a certain condition. This could be a helpful first step, for example, in identifying uniqueness of a row, if you want to make sure only a single row matches a given condition.
When we used the .index attribute above, we noticed that it returned a list-like object containing our row numbers. Because of this, we can pass this object into the len() function to count how many items it contains.
Let’s see how we can repeat an example above and count how many rows match that condition using Pandas:
# Count number of rows matching a condition
row_numbers = df[(df['Gender'] == 'Female') & (df['City'] == 'Toronto')].index
print(len(row_numbers))
# Returns: 1
We can see that by passing the index object into the len() function, we can confirm that only a single row matches our condition. This allows us to check for duplicates based on what we might assume to be a unique key. It can also help us confirm whether enough rows match a given condition.
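An alternative worth knowing is to count directly on the boolean mask, since each True counts as 1 when summed. This is a minimal sketch on a two-column copy of the sample dataframe, not the only way to do it:

```python
import pandas as pd

# Only the columns needed from the sample dataframe in this tutorial
df = pd.DataFrame({
    'Gender': ['Female', 'Female', 'Female', 'Male', 'Male', 'Female', 'Male'],
    'City': ['Atlanta', 'Toronto', 'New York City', 'Madrid', 'Montreal', 'Vancouver', 'Paris'],
})

# Build the boolean mask once; True marks a matching row
mask = (df['Gender'] == 'Female') & (df['City'] == 'Toronto')

# Summing a boolean Series counts the True values
count = int(mask.sum())
print(count)  # 1

# A simple uniqueness check based on the count
is_unique = count == 1
print(is_unique)  # True
```

This avoids building the filtered dataframe just to count its rows, which may matter on larger data.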
Conclusion
In this tutorial, you learned how to use Pandas to get the row numbers of a Pandas Dataframe that match a given condition. You also learned how to get the row numbers of rows that match multiple conditions, and how to get only the first matching row number. Finally, you learned how to use Pandas to count the number of rows that match a given condition.
To learn more about the Pandas .index attribute, check out the official Pandas documentation.