In this post, you’ll learn how to use Pandas to add days to a date column. You’ll learn how to add a constant number of days to a column as well as how to add days to a column based on another column. Finally, you’ll learn which approach to use with timezone-aware dates so that daylight saving time is taken into account. Let’s get started!
Working with datetime values in Pandas can be an intimidating process! Learn how to complete this common task with an easy-to-follow tutorial.
The Short Answer: Use Pandas pd.Timedelta()
Loading a Sample Pandas Dataframe
If you don’t have a dataset to practise with but would like to follow along, feel free to use the sample dataframe provided below.
import pandas as pd
df = pd.DataFrame(
{'Name': ['Jane', 'Melissa', 'John', 'Matt'],
'Arrival Date': ['2021-08-01', '2021-04-24', '2021-09-30', '2021-07-13'],
'Stay Length': [12, 15, 7, 6]})
df['Arrival Date'] = pd.to_datetime(df['Arrival Date'])
print(df)
Note that we’ve used the Pandas pd.to_datetime() function to convert the Arrival Date column to datetime. This is helpful when Pandas doesn’t automatically parse a column as the datetime type, as is the case here, where the dates were passed in as strings.
Printing this dataframe returns the following:
      Name Arrival Date  Stay Length
0     Jane   2021-08-01           12
1  Melissa   2021-04-24           15
2     John   2021-09-30            7
3     Matt   2021-07-13            6
Let’s imagine that this data represents people checking into a resort. We can see here that we have three columns:
- Name: the person’s name
- Arrival Date: the date the person arrived
- Stay Length: how long the person stayed, in days
Use Pandas to Add Constant Days to a Date Column
In this section, you’ll learn how to use Pandas to add a constant number of days to a date column. This is useful when every row needs the same, predetermined offset, for example when you’re trying to determine eligibility for a row based on when an event occurred.
Let’s see how we can use the Pandas pd.Timedelta() class to add a constant number of days to a column:
df['Two weeks later'] = df['Arrival Date'] + pd.Timedelta(days=14)
print(df)
What we’ve done here is:
- Created a new column called Two weeks later that holds the date 14 days after each arrival
- Added a pd.Timedelta() object directly to the Arrival Date column
- Passed days=14 into the Timedelta() object
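We need to take this step, rather than simply adding 14, because Pandas needs to be told which unit we mean. An integer on its own doesn’t say whether it represents days, hours, or seconds, and recent versions of Pandas raise a TypeError if you try to add a plain integer to a datetime column.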
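To make that point concrete, here is a minimal sketch of what happens when a plain integer is added to a datetime Series. It assumes a recent Pandas version (1.0 or later); the exact error message may differ between versions, and the dates Series is a small stand-in for the Arrival Date column.

```python
import pandas as pd

# A small stand-in for the Arrival Date column
dates = pd.Series(pd.to_datetime(['2021-08-01', '2021-04-24']))

try:
    dates + 14  # ambiguous: 14 of which unit?
    raised = False
except TypeError as err:
    raised = True
    print(f'TypeError: {err}')

# Stating the unit explicitly removes the ambiguity
later = dates + pd.Timedelta(days=14)
print(later)
```

Wrapping the number in pd.Timedelta() is what tells Pandas that the 14 means days.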
Now let’s take a look at how you can add days to a column based on the values of another column.
Use Pandas to Add Days to a Date Column based on Another Column
There may be times when you want to use Pandas to add days to a column based on the values of another column. For example, you may be given the start date of something, a column with the number of days, and need to calculate the end date.
If you’ve been following along with our example, you may want to know when each guest checked out. This can be calculated by taking the Arrival Date column and adding the Stay Length column. However, as discussed in the earlier section, we can’t simply add the two columns together.
Let’s see how we can use Python and Pandas to accomplish this:
df['Departure Date'] = df['Arrival Date'] + pd.to_timedelta(df['Stay Length'], unit='D')
print(df)
This returns the following Pandas Dataframe:
      Name Arrival Date  Stay Length Departure Date
0     Jane   2021-08-01           12     2021-08-13
1  Melissa   2021-04-24           15     2021-05-09
2     John   2021-09-30            7     2021-10-07
3     Matt   2021-07-13            6     2021-07-19
What we’ve done here is use the Pandas pd.to_timedelta() function to convert the Stay Length column to timedelta values. We pass in the column as the first argument and pass 'D' (for days) to the unit= parameter.
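If you’d like to see what the conversion produces on its own, the sketch below applies pd.to_timedelta() to a standalone Series. The stay_length Series is a hypothetical copy of the Stay Length values, used only for illustration.

```python
import pandas as pd

# Hypothetical copy of the Stay Length values
stay_length = pd.Series([12, 15, 7, 6])

# Each integer is interpreted as a number of days
as_timedelta = pd.to_timedelta(stay_length, unit='D')
print(as_timedelta)

# Every element is now a pd.Timedelta, so it can be added to a datetime
print(as_timedelta.iloc[0] == pd.Timedelta(days=12))
```

Because the result is a Series of timedeltas with the same index as the original column, it lines up row by row when added to Arrival Date.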
What is the Difference between .Timedelta() and .DateOffset()?
Pandas offers two different ways to add days: the Pandas pd.Timedelta() class and the Pandas pd.DateOffset() class. It may not be immediately apparent what the difference between the two is.
A Pandas pd.Timedelta() always treats a day as exactly 24 hours, even when a calendar day is actually 23 or 25 hours long (say, when clocks change for daylight saving time). A pd.DateOffset(), by contrast, moves to the same wall-clock time on a later calendar day, so with timezone-aware datetimes it factors these clock changes in.
If working with timezones is critical to your work and you need your calculations to be aware of this, use the Pandas pd.DateOffset() class.
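The difference only shows up with timezone-aware datetimes. As a small illustration, the sketch below uses the America/New_York timezone, where clocks moved forward on 2021-03-14; the timestamp is chosen for this example and isn’t part of the sample dataframe.

```python
import pandas as pd

# Noon on the day before US clocks moved forward (2021-03-14)
ts = pd.Timestamp('2021-03-13 12:00', tz='America/New_York')

# Timedelta adds exactly 24 elapsed hours, so the wall-clock time shifts
plus_timedelta = ts + pd.Timedelta(days=1)

# DateOffset keeps the same wall-clock time on the next calendar day
plus_offset = ts + pd.DateOffset(days=1)

print(plus_timedelta)  # 2021-03-14 13:00:00-04:00
print(plus_offset)     # 2021-03-14 12:00:00-04:00
```

For timezone-naive columns, such as Arrival Date in the sample dataframe, both approaches give the same result.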
Conclusion
In this post, you learned how to use Pandas to add days to a column, both by adding a constant number of days to a column and by adding days based on the values of another column. Finally, you learned the difference between the Pandas pd.Timedelta() class and the Pandas pd.DateOffset() class.
To learn more about the Pandas Timedelta class, check out the official documentation. To learn more about the Pandas DateOffset class, you can check out its official documentation as well.