In this tutorial, you’ll learn how to use the Pandas to_datetime function to convert a Pandas column to date time. Pandas provides a huge number of methods and functions that make working with dates incredibly versatile. However, data aren’t always read correctly.
By the end of this tutorial, you’ll have learned:
- How to use the Pandas to_datetime function to convert a column to date time
- How to specify format codes to convert different formats of strings
- Convert integers of seconds and dates to datetime in Pandas
Let’s get started!
Table of Contents
Understanding the Pandas to_datetime Function
Let’s begin by taking a look at the Pandas to_datetime()
function, which allows you to pass in a Pandas Series to convert it to datetime. The function provides a large number of versatile parameters that allow you to customize the behavior.
Let’s take a look at these parameters:
# Understanding the Pandas to_datetime() Function
pd.to_datetime(
arg, # The column to convert
errors='raise', # How to handle errors
dayfirst=False, # Whether days come first
yearfirst=False, # Whether years come first
utc=None, # Whether to control UTC Timezone handling
format=None, # The strftime format to parse
exact=True, # Whether to search for an exact match
unit=None, # The unit to use if an integer timestamp is used
infer_datetime_format=False, # Whether to infer the format or not
origin='unix', # Defines the reference date
cache=True # Whether to use a cache to speed up equal dates
)
As you can see the function has a huge number of parameters available. In this tutorial, we’ll focus on the primary ones that cover most of the use cases of the function.
Use Pandas to_datetime to Convert a Column to Date Time
Let’s start by simply converting a string column to date time. We can load the Pandas DataFrame below and print out its data types using the info()
method:
import pandas as pd
df = pd.DataFrame(['2022-04-01',
'2022-04-02', '2022-04-03', '2022-04-04', '2022-04-05'], columns=['Date'])
print(df.info())
# Returns:
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 5 entries, 0 to 4
# Data columns (total 1 columns):
# # Column Non-Null Count Dtype
# --- ------ -------------- -----
# 0 Date 5 non-null object
# dtypes: object(1)
# memory usage: 168.0+ bytes
# None
While the data looks like dates, it’s actually formatted as strings. Let’s see how we can use the Pandas to_datetime function to convert the string column to a date time.
# Converting a Column to Date Time in Pandas
import pandas as pd
df = pd.DataFrame(['2022-04-01',
'2022-04-02', '2022-04-03', '2022-04-04', '2022-04-05'], columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'])
print(df.info())
# Returns:
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 5 entries, 0 to 4
# Data columns (total 1 columns):
# # Column Non-Null Count Dtype
# --- ------ -------------- -----
# 0 Date 5 non-null datetime64[ns]
# dtypes: datetime64[ns](1)
# memory usage: 168.0 bytes
# None
Pandas was able to infer the datetime format and correctly convert the string to a datetime data type. In the next section, you’ll learn how to specify specific formats.
Specify Datetime Formats in Pandas to_datetime
There will be many times when you receive a date column in a format that’s not immediately inferred by Pandas. In these situations, you can use the format=
parameter to specify a particular formatting string. These strings follow strftime conventions, which are consistent across many programming languages.
Let’s load a DataFrame with a strange formatting string:
# Loading a strange date format
import pandas as pd
df = pd.DataFrame(['2022-04-*-01', '2022-04-*-02', '2022-04-*-03', '2022-04-*-04', '2022-04-*-05'], columns=['Date'])
print(df)
# Returns:
# Date
# 0 2022-04-*-01
# 1 2022-04-*-02
# 2 2022-04-*-03
# 3 2022-04-*-04
# 4 2022-04-*-05
Let’s use the format=
parameter to specify a custom format:
# Converting a Column to Date Time in Pandas
import pandas as pd
df = pd.DataFrame(['2022-04-*-01', '2022-04-*-02', '2022-04-*-03', '2022-04-*-04', '2022-04-*-05'], columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-*-%d')
print(df)
# Returns:
# Date
# 0 2022-04-01
# 1 2022-04-02
# 2 2022-04-03
# 3 2022-04-04
# 4 2022-04-05
We can see in the example above that specifying a custom format string, Pandas is able to correctly infer the date format.
Convert Unix Integer Days to DateTime in Pandas
Another powerful conversion that Pandas provides is to convert integers into Unix days. This can be done by passing in a Series of integers into a date time object. In order to do this, you need to provide a unit=
parameter to specify days, meaning that the integers represent a day from a specified origin.
By default, Pandas will use the 'unix'
origin, which represents January 1st, 1970 at 00:00:00 UTC. Let’s see how we can convert a column of integers into a datetime object:
# Converting an Integer to Date Time in Pandas
import pandas as pd
df = pd.DataFrame([12345, 23453, 12324, 34534, 12313], columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'], unit='D')
print(df)
# Returns:
# Date
# 0 2003-10-20
# 1 2034-03-19
# 2 2003-09-29
# 3 2064-07-20
# 4 2003-09-18
Because Pandas uses 'unix'
as the default value for origin=
, we didn’t need to specify this.
Convert Integer Seconds to DateTime in Pandas
Similar to the example above, you may encounter situations where you need to convert integer values representing seconds. This works in the same way as the example above, except you’ll specify unit='s'
as a parameter:
# Converting an Integer Column to Date Time in Pandas
import pandas as pd
df = pd.DataFrame([12345, 23453, 12324, 34534, 12313], columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'], unit='s')
print(df)
# Returns:
# Date
# 0 1970-01-01 03:25:45
# 1 1970-01-01 06:30:53
# 2 1970-01-01 03:25:24
# 3 1970-01-01 09:35:34
# 4 1970-01-01 03:25:13
In the final section, you’ll learn how to convert multiple Pandas columns to datetime.
Convert Multiple Pandas Columns to DateTime
To convert multiple columns to datetime in Pandas, you can combine the Pandas apply and to_datetime functions. The .apply()
method is applied to a section of multiple columns, and the to_datetime()
function into it.
Let’s see how we can do this using Pandas:
# Converting Multiple Columns to Date Time in Pandas
import pandas as pd
df = pd.DataFrame(
[['2022-04-01', '2022-05-01'],
['2022-04-02', '2022-05-02'],
['2022-04-03', '2022-05-03'],
['2022-04-04', '2022-05-04'],
['2022-04-05', '2022-05-05']], columns=['Date1', 'Date2'])
df[['Date1', 'Date2']] = df[['Date1', 'Date2']].apply(pd.to_datetime)
print(df)
# Returns:
# Date1 Date2
# 0 2022-04-01 2022-05-01
# 1 2022-04-02 2022-05-02
# 2 2022-04-03 2022-05-03
# 3 2022-04-04 2022-05-04
# 4 2022-04-05 2022-05-05
In the example above, the pd.to_datetime
function was passed into the .apply()
method to convert multiple columns to datetime.
Conclusion
In this tutorial, you learned how to use the Pandas to_datetime function to convert a column to datetime data types. You learned how to do this using strings and integers. You also learned how to convert multiple columns to a datetime data type using the .apply()
method.
Additional Resources
To learn more about related topics, check out the tutorials below: