Skip to content

Pandas to_datetime: Convert a Pandas String Column to Date Time

Pandas to_datetime Convert a Column to Date Time Cover Image

In this tutorial, you’ll learn how to use the Pandas to_datetime function to convert a Pandas column to date time. Pandas provides a huge number of methods and functions that make working with dates incredibly versatile. However, data aren’t always read correctly.

By the end of this tutorial, you’ll have learned:

  • How to use the Pandas to_datetime function to convert a column to date time
  • How to specify format codes to convert different formats of strings
  • Convert integers of seconds and dates to datetime in Pandas

Let’s get started!

Understanding the Pandas to_datetime Function

Let’s begin by taking a look at the Pandas to_datetime() function, which allows you to pass in a Pandas Series to convert it to datetime. The function provides a large number of versatile parameters that allow you to customize the behavior.

Let’s take a look at these parameters:

# Understanding the Pandas to_datetime() Function
pd.to_datetime(
    arg,                            # The column to convert
    errors='raise',                 # How to handle errors
    dayfirst=False,                 # Whether days come first
    yearfirst=False,                # Whether years come first
    utc=None,                       # Whether to control UTC Timezone handling
    format=None,                    # The strftime format to parse
    exact=True,                     # Whether to search for an exact match
    unit=None,                      # The unit to use if an integer timestamp is used
    infer_datetime_format=False,    # Whether to infer the format or not
    origin='unix',                  # Defines the reference date
    cache=True                      # Whether to use a cache to speed up equal dates
    )         

As you can see the function has a huge number of parameters available. In this tutorial, we’ll focus on the primary ones that cover most of the use cases of the function.

Use Pandas to_datetime to Convert a Column to Date Time

Let’s start by simply converting a string column to date time. We can load the Pandas DataFrame below and print out its data types using the info() method:

import pandas as pd

df = pd.DataFrame(['2022-04-01',
'2022-04-02', '2022-04-03', '2022-04-04', '2022-04-05'], columns=['Date'])
print(df.info())

# Returns:
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 5 entries, 0 to 4
# Data columns (total 1 columns):
#  #   Column  Non-Null Count  Dtype 
# ---  ------  --------------  ----- 
#  0   Date    5 non-null      object
# dtypes: object(1)
# memory usage: 168.0+ bytes
# None

While the data looks like dates, it’s actually formatted as strings. Let’s see how we can use the Pandas to_datetime function to convert the string column to a date time.

# Converting a Column to Date Time in Pandas
import pandas as pd

df = pd.DataFrame(['2022-04-01',
'2022-04-02', '2022-04-03', '2022-04-04', '2022-04-05'], columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'])
print(df.info())

# Returns:
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 5 entries, 0 to 4
# Data columns (total 1 columns):
#  #   Column  Non-Null Count  Dtype         
# ---  ------  --------------  -----         
#  0   Date    5 non-null      datetime64[ns]
# dtypes: datetime64[ns](1)
# memory usage: 168.0 bytes
# None

Pandas was able to infer the datetime format and correctly convert the string to a datetime data type. In the next section, you’ll learn how to specify specific formats.

Specify Datetime Formats in Pandas to_datetime

There will be many times when you receive a date column in a format that’s not immediately inferred by Pandas. In these situations, you can use the format= parameter to specify a particular formatting string. These strings follow strftime conventions, which are consistent across many programming languages.

Let’s load a DataFrame with a strange formatting string:

# Loading a strange date format
import pandas as pd

df = pd.DataFrame(['2022-04-*-01', '2022-04-*-02', '2022-04-*-03', '2022-04-*-04', '2022-04-*-05'], columns=['Date'])
print(df)

# Returns:
#            Date
# 0  2022-04-*-01
# 1  2022-04-*-02
# 2  2022-04-*-03
# 3  2022-04-*-04
# 4  2022-04-*-05

Let’s use the format= parameter to specify a custom format:

# Converting a Column to Date Time in Pandas
import pandas as pd

df = pd.DataFrame(['2022-04-*-01', '2022-04-*-02', '2022-04-*-03', '2022-04-*-04', '2022-04-*-05'], columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-*-%d')

print(df)

# Returns:
#         Date
# 0 2022-04-01
# 1 2022-04-02
# 2 2022-04-03
# 3 2022-04-04
# 4 2022-04-05

We can see in the example above that specifying a custom format string, Pandas is able to correctly infer the date format.

Convert Unix Integer Days to DateTime in Pandas

Another powerful conversion that Pandas provides is to convert integers into Unix days. This can be done by passing in a Series of integers into a date time object. In order to do this, you need to provide a unit= parameter to specify days, meaning that the integers represent a day from a specified origin.

By default, Pandas will use the 'unix' origin, which represents January 1st, 1970 at 00:00:00 UTC. Let’s see how we can convert a column of integers into a datetime object:

# Converting an Integer to Date Time in Pandas
import pandas as pd

df = pd.DataFrame([12345, 23453, 12324, 34534, 12313], columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'], unit='D')

print(df)

# Returns:
#         Date
# 0 2003-10-20
# 1 2034-03-19
# 2 2003-09-29
# 3 2064-07-20
# 4 2003-09-18

Because Pandas uses 'unix' as the default value for origin=, we didn’t need to specify this.

Convert Integer Seconds to DateTime in Pandas

Similar to the example above, you may encounter situations where you need to convert integer values representing seconds. This works in the same way as the example above, except you’ll specify unit='s' as a parameter:

# Converting an Integer Column to Date Time in Pandas
import pandas as pd

df = pd.DataFrame([12345, 23453, 12324, 34534, 12313], columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'], unit='s')

print(df)

# Returns:
#                  Date
# 0 1970-01-01 03:25:45
# 1 1970-01-01 06:30:53
# 2 1970-01-01 03:25:24
# 3 1970-01-01 09:35:34
# 4 1970-01-01 03:25:13

In the final section, you’ll learn how to convert multiple Pandas columns to datetime.

Convert Multiple Pandas Columns to DateTime

To convert multiple columns to datetime in Pandas, you can combine the Pandas apply and to_datetime functions. The .apply() method is applied to a section of multiple columns, and the to_datetime() function into it.

Let’s see how we can do this using Pandas:

# Converting Multiple Columns to Date Time in Pandas
import pandas as pd

df = pd.DataFrame(
    [['2022-04-01', '2022-05-01'],
    ['2022-04-02', '2022-05-02'],
    ['2022-04-03', '2022-05-03'],
    ['2022-04-04', '2022-05-04'],
    ['2022-04-05', '2022-05-05']], columns=['Date1', 'Date2'])

df[['Date1', 'Date2']] = df[['Date1', 'Date2']].apply(pd.to_datetime)

print(df)

# Returns:
#        Date1      Date2
# 0 2022-04-01 2022-05-01
# 1 2022-04-02 2022-05-02
# 2 2022-04-03 2022-05-03
# 3 2022-04-04 2022-05-04
# 4 2022-04-05 2022-05-05

In the example above, the pd.to_datetime function was passed into the .apply() method to convert multiple columns to datetime.

Conclusion

In this tutorial, you learned how to use the Pandas to_datetime function to convert a column to datetime data types. You learned how to do this using strings and integers. You also learned how to convert multiple columns to a datetime data type using the .apply() method.

Additional Resources

To learn more about related topics, check out the tutorials below:

Leave a Reply

Your email address will not be published.