Sorting data is an essential method to better understand your data. In this post, you’ll learn how to sort data in a Pandas dataframe using the Pandas .sort_values()
function, in ascending and descending order, as well as sorting by multiple columns. Specifically, you’ll learn how to use the by=
, ascending=
, inplace=
, and na_position=
parameters.
Video Tutorial
Table of Contents
- Loading the dataset and required libraries
- Exploring the Pandas Sort_Values() Function
- Sort Data in a Pandas Dataframe Column
- Sort Data in Multiple Pandas Dataframe Columns
- Change Sort Order in Pandas
- Pandas Sort_Values Na_Position Parameter
- Changing Sort Order In Place in Pandas Sort_Values
- Conclusion
Loading the dataset and required libraries
To start, let’s load the Pandas
library and a dataset created for this tutorial. We’ll print out the first five rows, using the .head()
method and take a quick look at the dataset:
import pandas as pd df = pd.read_excel('https://github.com/datagy/mediumdata/raw/master/sort_values.xlsx') print(df.head())
In the code above, you first imported the Pandas library, then used the .read_excel()
method to load a dataset. Finally, you printed the first five rows of the dataset using the .head()
method.
Let’s take a quick look at what the dataset looks like:
Date Name Score
0 2020-07-01 John 10
1 2020-07-01 Jane 35
2 2020-07-01 Matt 23
3 2020-07-02 John 19
4 2020-07-02 Jane 38
The dataset contains three columns: (1) Date, (2), Name, and (3) Score. Specifically, these columns are made up of datetime, string, and integer datatypes, meaning we have a large variety of sorting options!
Exploring the Pandas Sort_Values() Function
Now that you’ve loaded the Pandas library and assigned a dataset to the dataframe df
, let’s take a look at some of the key parameters available in the Pandas .sort_values()
function:
DataFrame.sort_values(by, ascending=True, inplace=False, na_position='last')
The .sort_value()
function is applied directly to a DataFrame object and take more arguments than listed above, but these are the key ones found in most applications.
Let’s a take a deeper look at these:
- The
by=
argument identifies, which column or columns to use to sort your data, - The
ascending=
argument defaults to True and setting it to False will sort your data in descending order, - The
inplace=
argument will modify the DataFrame object when set to False, without having to reassign it, - The
na_position=
defaults to ‘last’, but changing to ‘first’ will place those values first in sorting order.
Now let’s dive into actually sorting your data.
Sort Data in a Pandas Dataframe Column
The most important parameter in the .sort_values()
function is the by=
parameter, as it tells Pandas which column(s) to sort by. Let’s say you wanted to sort the DataFrame df
you created earlier in the tutorial by the Name column. You could then write:
df.sort_values(by='Name')
Here, you’ve applied the .sort_values()
method to the DataFrame object, df
. You’ve also applied the by='Name'
parameter and argument.
This returns the following printout, which I’ve truncated to five records to save space:
Date Name Score
1 2020-07-01 Jane 35
4 2020-07-02 Jane 38
7 2020-07-03 Jane 34
0 2020-07-01 John 10
3 2020-07-02 John 19
With this, you’ve sorted your dataset by the Name column in ascending order.
Sort Data in Multiple Pandas Dataframe Columns
In the example above, you sorted your dataframe by a single column. You can sort your data by multiple columns by passing in a list of column items into the by=
parameter. Let’s try this again by sorting by both the Name and Score columns:
df.sort_values(by=['Name', 'Score'])
Again, let’s take a look at what this looks like when it’s returned:
Date Name Score
7 2020-07-03 Jane 34
1 2020-07-01 Jane 35
4 2020-07-02 Jane 38
0 2020-07-01 John 10
3 2020-07-02 John 19
You can see here that the dataframe is first sorted by the Name column (meaning Jane precedes John, and John precedes Matt), then for each unique item in the Name column, the values in the Score column are further sorted in ascending order.
Change Sort Order in Pandas
By default, the .sort_values()
method will sort values in ascending order – but you may wish to change the sort order to descending.
Let’s take a look at how to do this. We’ll sort the dataframe again first by the Name and Score columns, but this time add in the ascending=False
argument:
df.sort_values(by=['Name', 'Score'], ascending=False)
Here, you’re sorting the data by the Name and Score columns, but in descending order:
Date Name Score
2 2020-07-01 Matt 23
8 2020-07-03 Matt 8
6 2020-07-03 John 36
3 2020-07-02 John 19
0 2020-07-01 John 10
This is really handy, but say you wanted to sort columns in different orders. To do this, you would simply pass a list of orders into the ascending=
argument.
Let’s try this by sorting the Name column in ascending order and Score column in descending order:
df.sort_values(by=['Name', 'Score'], ascending=[True, False])
This returns the following dataframe, with the Name column sorted in ascending order and the Score column sorted in descending order:
Date Name Score
4 2020-07-02 Jane 38
1 2020-07-01 Jane 35
7 2020-07-03 Jane 34
6 2020-07-03 John 36
3 2020-07-02 John 19
Now let’s take a look at how to change the sort order of missing values.
Pandas Sort_Values Na_Position Parameter
By default, Pandas will sort any missing values to the last position. In order to change this behavior, you can use the na_position='first'
argument.
Let’s try this out by sorting the Name column and placing missing values first:
df.sort_values(by='Name', na_position='first')
By applying this code, you’re generating the following dataframe:
Date Name Score
5 2020-07-02 NaN 34
1 2020-07-01 Jane 35
4 2020-07-02 Jane 38
7 2020-07-03 Jane 34
0 2020-07-01 John 10
Finally, let’s see how to apply the change in sort order in place.
Check out some other Python tutorials on datagy, including our complete guide to styling Pandas and our comprehensive overview of Pivot Tables in Pandas!
Changing Sort Order In Place in Pandas Sort_Values
All of the examples you’ve learned above haven’t actually been applied to the dataframe itself, meaning that the dataframe object hasn’t actually been modified.
You could reassign the dataframe (such as, to itself), or you can modify the dataframe directly by using the inplace=
argument.
Let’s change the sort order and apply the changes in place:
df.sort_values(by='Name', inplace=True)
This has now modified the dataframe, meaning that if you now print the head of the dataframe using the .head()
method, you’d receive the following:
print(df.head())
Date Name Score
1 2020-07-01 Jane 35
4 2020-07-02 Jane 38
7 2020-07-03 Jane 34
0 2020-07-01 John 10
3 2020-07-02 John 19
Conclusion
In this post, you learned how to use the Pandas sort_values() function to sort data in a Pandas dataframe. Specifically, you learned how to sort by a single or by multiple columns, how to change the sort order, how to place missing values at the tail or the head, and how to change the sort order in place.
To learn more about the function, check out the official documentation here.

Want to learn Python for Data Science? Check out my ebook for as little as $10!