Creating date ranges with Pandas can significantly speed up your workflow when you need to iterate over a number of dates. For example, when we run queries on APIs or databases, we may need to generate a list of dates that can be iterated over. This can be a time-consuming task, but thankfully we can easily accomplish this with Pandas.
The Pandas Date Range Function
Let’s take a look at the Pandas date_range function available in Pandas. For the purposes of this tutorial, we’ll look at a reduced number of the parameters:
pd.date_range(
start=None,
end=None,
periods=None,
freq=None
)
- The start and end parameters look for strings or datetime-like objects.
- The periods parameter identifies the number of periods to generate.
- The frequency parameter identifies the number of periods to generate.
Generating our First Date Range
Let’s say we want to create a list of the first of the month of every month in 2020. We could write the following code:
import pandas as pd
first2020 = pd.date_range(start='2020-01-01', end='2020-12-01', freq='MS')
We use the frequency of MS to signal that we want to return the start of the month. This would generate a list contains the following:
DatetimeIndex (['2020-01-01', '2020-02-01', '2020-03-01', '2020-04-01', '2020-05-01', '2020-06-01', '2020-07-01', '2020-08-01', '2020-09-01', '2020-10-01', '2020-11-01', '2020-12-01'],
dtype='datetime64[ns]', freq='MS')
Each item in this is a datetime object. If we want to generate this into a string, we could iterate over each item. For this, we can use a for loop, as shown below:
import pandas as pd
first2020 = pd.date_range(start='2020-01-01', end='2020-12-01', freq='MS')
list2020 = []
for i in first2020:
list2020.append(i.strftime('%Y-%m-%d'))
This returns a list containing:
['2020-01-01', '2020-02-01', '2020-03-01', '2020-04-01', '2020-05-01', '2020-06-01', '2020-07-01', '2020-08-01', '2020-09-01',
'2020-10-01', '2020-11-01', '2020-12-01']
Similarly, you could rewrite the entire code as a list comprehension:
import pandas as pd
first2020 = pd.date_range(start='2020-01-01', end='2020-12-01', freq='MS')
list2020 = [i.strftime('%Y-%m-%d') for i in first2020]
Some Other Helpful Frequencies
We may want to use some other frequencies:
- B: business day frequency
- D: calendar day frequency
- W: weekly frequency
- M: month-end frequency
Conclusion: Creating Date Ranges with Pandas
In this post we learned how to create handy date ranges with Pandas using Python. It’s a handy and quick way to help create iterable objects for querying APIs and databases. You can learn more about the function by visiting the official documentation.
Pingback: Pandas Diff: Calculate the Difference Between Pandas Rows • datagy