Pandas to_datetime() Usage Explained [Practical Examples]

Getting started with pandas to_datetime function

This function converts a scalar, array-like, Series or DataFrame/dict-like object to pandas datetime values. The return type depends on the input: a scalar returns a Timestamp, a list, tuple or Index returns a DatetimeIndex, and a Series or DataFrame returns a Series of datetime64 values.
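
To see how the return type follows the input type, here is a minimal sketch. The sample dates and variable names are illustrative only.

```python
import pandas as pd

# A single string returns a pandas Timestamp
scalar_result = pd.to_datetime("2021-01-25")

# A list returns a DatetimeIndex
list_result = pd.to_datetime(["2021-01-25", "2021-01-26"])

# A Series returns a Series with a datetime64 dtype
series_result = pd.to_datetime(pd.Series(["2021-01-25", "2021-01-26"]))

for result in (scalar_result, list_result, series_result):
    print(type(result).__name__)
```

This should print Timestamp, DatetimeIndex and Series, in that order.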

Syntax:

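pandas.to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False, utc=None, format=None, unit=None)

Parameters:

  • arg - The value to convert: a scalar, list, tuple, Series or DataFrame/dict-like object.
  • errors - Controls what happens when a value cannot be parsed. The default 'raise' raises an exception, while 'coerce' replaces the value with NaT.
  • dayfirst - A boolean value. When True, an ambiguous date such as 04/03/2021 is parsed with the day first (4 March 2021). The default is False, which parses the month first.
  • yearfirst - A boolean value. When True, an ambiguous date such as 10/11/12 is parsed with the year first (2010-11-12).
  • utc - When True, the result is timezone-aware and expressed in UTC.
  • format - The strftime-style format string that describes how the input is laid out, for example '%d/%m/%Y'.
  • unit - The unit of a numeric input, such as 's' for seconds or 'ms' for milliseconds, counted from the Unix epoch.

In a format string, %d represents the day of the month, %m represents the month and %Y represents the four-digit year (%y is the two-digit year).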
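
The effect of dayfirst is easiest to see on an ambiguous date. The following is a small sketch with an illustrative input string, not an exhaustive description of the parser:

```python
import pandas as pd

ambiguous = "04/03/2021"

# Default (dayfirst=False): the first number is read as the month
month_first = pd.to_datetime(ambiguous)

# dayfirst=True: the first number is read as the day
day_first = pd.to_datetime(ambiguous, dayfirst=True)

print(month_first)  # 2021-04-03 00:00:00
print(day_first)    # 2021-03-04 00:00:00
```

If the layout of the input is known in advance, passing an explicit format is usually the safer choice.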

 

Example-1. Convert String to DateTime

We can take a simple date string and convert it to a datetime. In this example I define a date string and then convert it with to_datetime(). Because dayfirst defaults to False, the ambiguous string 04/03/2021 is read month first, as April 3rd:

import pandas as pd

# Define string
date = '04/03/2021 11:23'

# Convert string to datetime format
date1 = pd.to_datetime(date)

# print to_datetime output
print(date1)

# print day, month and year separately from the to_datetime output
print("Day: ", date1.day)
print("Month", date1.month)
print("Year", date1.year)

Output:

2021-04-03 11:23:00
Day:  3
Month 4
Year 2021

 

Example-2. Convert Series to DateTime

Here we have a pandas Series which we will convert to datetime format:

import pandas as pd

# Define pandas Series
times = pd.Series(["2021-01-25", "2021/01/08", "2021", "Jan 4th, 2022"])

# Print Series
print("Series: \n", times, "\n")

# Convert Series to datetime
print("datetime: \n", pd.to_datetime(times))

Output:

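As you can see, our Series contains dates in several different formats, all of which are converted to datetime. On pandas 2.0 and later, which infer one format from the first value, a Series with mixed formats like this one may raise an error unless you pass format='mixed'.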

Series: 
0       2021-01-25
1       2021/01/08
2             2021
3    Jan 4th, 2022
dtype: object 

datetime: 
0   2021-01-25
1   2021-01-08
2   2021-01-01
3   2022-01-04
dtype: datetime64[ns]

 

Example-3. Handling exceptions during datetime conversion

But what happens if the Series contains ordinary text instead of a date? In that case to_datetime() raises an exception. For example, I have updated my Series to pd.Series(["2021-01-25", "2021/01/08", "2021", "Hello World", "Jan 4th, 2022"])

When we try to convert this Series with to_datetime(), we get the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/arrays/datetimes.py", line 2192, in objects_to_datetime64ns
values, tz_parsed = conversion.datetime_to_datetime64(data.ravel("K"))
  File "pandas/_libs/tslibs/conversion.pyx", line 359, in pandas._libs.tslibs.conversion.datetime_to_datetime64
TypeError: Unrecognized value type: <class 'str'>

During handling of the above exception, another exception occurred:

To handle this, we can pass errors='coerce', which replaces every value that to_datetime() fails to convert with NaT, i.e. Not a Time.

Let's update our code:

import pandas as pd

# Define pandas Series
times = pd.Series(["2021-01-25", "2021/01/08", "2021", "Hello World", "Jan 4th, 2022"])

# Print Series
print("Series: \n", times, "\n")

# Convert Series to datetime
print("datetime: \n", pd.to_datetime(times, errors = 'coerce'))

Output:

As you can see, Hello World is now replaced with NaT because to_datetime() was unable to convert that value.

Series: 
0       2021-01-25
1       2021/01/08
2             2021
3      Hello World
4    Jan 4th, 2022
dtype: object 

datetime: 
0   2021-01-25
1   2021-01-08
2   2021-01-01
3          NaT
4   2022-01-04
dtype: datetime64[ns]
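
Once unparseable values have been coerced to NaT, it is often useful to find out which inputs failed. The sketch below assumes a small illustrative Series; the variable names are not part of the original example.

```python
import pandas as pd

raw = pd.Series(["2021-01-25", "2021-01-08", "Hello World"])

# Unparseable values become NaT instead of raising an exception
parsed = pd.to_datetime(raw, errors="coerce")

# isna() marks the rows that failed to parse
failed = raw[parsed.isna()]
print(failed.tolist())  # ['Hello World']

# Keep only the rows that parsed successfully
clean = parsed.dropna()
print(len(clean))  # 2
```

Inspecting the failed values before dropping them helps to catch silent data loss.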

 

Example-4. Convert Unix times to DateTime

Unix time stores a moment as the number of seconds elapsed since midnight UTC on January 1st, 1970. Because the datetime is stored as a plain number of seconds, it is easy to convert it to a specific date and time without running into formatting issues with dashes, slashes and other symbols. Pass unit="s" to tell to_datetime() that the numbers are seconds.

import pandas as pd

# Define pandas Series
times = pd.Series([1349720105, 1349806505, 1349979305, 1350065705])

# Convert Series to datetime
print("datetime:\n", pd.to_datetime(times, unit = "s"))

Output:

datetime:
0   2012-10-08 18:15:05
1   2012-10-09 18:15:05
2   2012-10-11 18:15:05
3   2012-10-12 18:15:05
dtype: datetime64[ns]
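
Timestamps are not always stored in seconds. As a brief sketch, the same instant can be passed in milliseconds with unit="ms", and utc=True makes the result timezone-aware. The values reuse the first timestamp from the example above.

```python
import pandas as pd

# The same instant expressed in seconds and in milliseconds
from_seconds = pd.to_datetime(1349720105, unit="s")
from_millis = pd.to_datetime(1349720105000, unit="ms")
print(from_seconds == from_millis)  # True

# utc=True returns a timezone-aware result in UTC
aware = pd.to_datetime(1349720105, unit="s", utc=True)
print(aware)  # 2012-10-08 18:15:05+00:00
```

Choosing the wrong unit does not raise an error; it produces a different date or an out-of-range failure, so it is worth checking a known value.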

 

Example-5. Using format with to_datetime

to_datetime() will usually identify the day, month and year automatically, but there may be situations where the provided input is ambiguous or not in a standard format.

For example, I will define my date string in "%m/%d/%Y %H:%M" format, i.e. month/day/year followed by hours and minutes. If to_datetime() guesses the order incorrectly, fields such as the month will hold the wrong value. In such cases we use the format parameter to define the layout in which the input is provided to to_datetime().

import pandas as pd

# Define string
date = '05/03/2021 11:23'

# Convert string to datetime and define the format
date1 = pd.to_datetime(date, format='%m/%d/%Y %H:%M')

# print to_datetime output
print(date1)

# print individual field
print("Day: ", date1.day)
print("Month", date1.month)
print("Year", date1.year)

Output:

2021-05-03 11:23:00
Day:  3
Month 5
Year 2021
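
To illustrate how much the format string matters, here is a sketch that parses the same string day first, and then shows what happens when the input does not match the format. The second input string is an assumption made for illustration.

```python
import pandas as pd

date = "05/03/2021 11:23"

# Same string, but the format says the day comes first
day_first = pd.to_datetime(date, format="%d/%m/%Y %H:%M")
print(day_first)  # 2021-03-05 11:23:00

# A string that does not match the format raises ValueError
try:
    pd.to_datetime("2021-05-03", format="%m/%d/%Y")
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True
```

An explicit format therefore both removes ambiguity and acts as a validation step for the input.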

 

Example-6. Convert range of date to DateTime

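Example: In this example, we build a DataFrame column from a range of hourly dates and pass it to the to_datetime() function. The column already has a datetime64 dtype, so the call returns the values unchanged.

# import pandas
import pandas

# create a range of 12 timestamps at hourly frequency
# (recent pandas releases prefer the lowercase alias 'h')
data = pandas.date_range('1/1/2022', periods=12, freq='H')

# store the range in a DataFrame column
dataframe = pandas.DataFrame(data, columns=['date'])

# convert the column to datetime
print(pandas.to_datetime(dataframe['date']))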

Output:

0    2022-01-01 00:00:00
1    2022-01-01 01:00:00
2    2022-01-01 02:00:00
3    2022-01-01 03:00:00
4    2022-01-01 04:00:00
5    2022-01-01 05:00:00
6    2022-01-01 06:00:00
7    2022-01-01 07:00:00
8    2022-01-01 08:00:00
9    2022-01-01 09:00:00
10   2022-01-01 10:00:00
11   2022-01-01 11:00:00
Name: date, dtype: datetime64[ns]
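
to_datetime() also accepts a whole DataFrame whose columns hold the date components. A minimal sketch, assuming hypothetical sample data with columns named year, month and day:

```python
import pandas as pd

# Hypothetical sample data: one row per date
parts = pd.DataFrame({
    "year": [2022, 2022],
    "month": [1, 2],
    "day": [15, 20],
})

# Columns named year, month and day are combined row by row
dates = pd.to_datetime(parts)
print(dates)
```

The result is a Series of datetime64 values, one per row of the input DataFrame.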

 

Example-7. Change the format of to_datetime() output

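We can use dt.strftime to change the output format of the to_datetime() result. Note that strftime returns strings, not datetime values.

Each format code starts with the % symbol.

  1. %d represents the day of the month
  2. %m represents the month
  3. %Y represents the four-digit year

Example: In this example we parse the input with the "%m/%d/%Y %I:%M:%S %p" format and display the result in the "%m-%d-%Y %H:%M:%S" format.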

import pandas as pd

# Define a dataframe
df = {"Country": ["IND", "CAL", "LON"],
      "date": ["04/11/2022 09:13:55 AM", "05/10/2022 11:31:05 PM", "12/08/2022 08:00:00 AM"]}

df = pd.DataFrame(df)

# Define the formats to be used. format1 describes the input strings passed to to_datetime while format2 is the new output format
format1 = "%m/%d/%Y %I:%M:%S %p"
format2 = "%m-%d-%Y %H:%M:%S"

# Convert and store datetime in new format
df['date'] = pd.to_datetime(df['date'], format=format1).dt.strftime(format2)

# Print new datetime format
print(df)

Output:

  Country                 date
0     IND  04-11-2022 09:13:55
1     CAL  05-10-2022 23:31:05
2     LON  12-08-2022 08:00:00
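
Because dt.strftime returns text, overwriting the column discards the datetime dtype. One possible approach, sketched here with an illustrative column name date_text, is to keep the parsed column and store the display text separately:

```python
import pandas as pd

df = pd.DataFrame({"date": ["04/11/2022 09:13:55 AM"]})

# Keep the parsed datetime column for calculations
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y %I:%M:%S %p")

# Store the display text in a separate column (the name is illustrative)
df["date_text"] = df["date"].dt.strftime("%d-%m-%Y %H:%M")

print(df["date_text"].iloc[0])     # 11-04-2022 09:13
print(df["date"].dt.year.iloc[0])  # 2022
```

With this layout the .dt accessors keep working on the date column, while date_text is used only for display.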

 

Example-8. Remove time from to_datetime() output (Print only date)

In this scenario, we will discuss how to remove the time from the converted datetime. Apply dt.date to the result of to_datetime() to get only the date without the time.

Syntax:

pandas.to_datetime(dataframe['column']).dt.date

where,

  1. dataframe is the input dataframe
  2. column is the column name that includes datetime values

Example:

In this example, we remove the time from the datetime values for the same dataframe as above.

import pandas as pd

# Define a dataframe
df = {"Country": ["IND", "CAL", "LON"],
      "date": ["04/11/2022 09:13:55 AM", "05/10/2022 11:31:05 PM", "12/08/2022 08:00:00 AM"]}

df = pd.DataFrame(df)

# Remove time and only store the date
df['date'] = pd.to_datetime(df['date']).dt.date

# Print date without time
print(df)

Output:

  Country        date
0     IND  2022-04-11
1     CAL  2022-05-10
2     LON  2022-12-08
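
Note that dt.date produces Python date objects, so the column no longer has a datetime64 dtype. If the datetime dtype should be kept, dt.normalize() is an alternative worth considering. A short sketch with illustrative values:

```python
import pandas as pd

stamps = pd.to_datetime(pd.Series(["2022-04-11 09:13:55", "2022-05-10 23:31:05"]))

# dt.date returns Python date objects, so the dtype becomes object
dates_only = stamps.dt.date
print(dates_only.dtype)  # object

# dt.normalize() sets the time to midnight and keeps datetime64
normalized = stamps.dt.normalize()
print(normalized.iloc[1])  # 2022-05-10 00:00:00
```

The normalized Series still supports the .dt accessors and date arithmetic.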

 

Example-9. Parse month name with to_datetime()

Here we convert a date written with a month name to a timestamp (date and time) using to_datetime(). The input is the month name followed by the day and year.

Format:

Monthname day, year

Example:

In this example, we parse the following dates written with month names using the to_datetime() function.

# import module
import pandas

# convert month name datetime
print(pandas.to_datetime("January 5, 2022"))

# convert month name datetime
print(pandas.to_datetime("January 3, 2022"))

# convert month name datetime
print(pandas.to_datetime("May 5, 2022"))

# convert month name datetime
print(pandas.to_datetime("December 5, 2022"))

# convert month name datetime
print(pandas.to_datetime("July 24, 2022"))

Output:

2022-01-05 00:00:00
2022-01-03 00:00:00
2022-05-05 00:00:00
2022-12-05 00:00:00
2022-07-24 00:00:00
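
Month names can also be parsed with an explicit format instead of relying on automatic detection. The sketch below assumes an English locale, since %B and %b match locale-dependent month names; the sample strings are illustrative.

```python
import pandas as pd

# %B matches a full month name, %b an abbreviated one
full = pd.to_datetime("January 5, 2022", format="%B %d, %Y")
print(full)  # 2022-01-05 00:00:00

short = pd.to_datetime(pd.Series(["Jan 5, 2022", "Dec 24, 2022"]), format="%b %d, %Y")
print(short.dt.month.tolist())  # [1, 12]
```

Giving the format explicitly avoids any guessing when every value in a column follows the same layout.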

 

Example-10. Add timezone to to_datetime() output

We can attach a timezone using the tz_localize() method after converting the date data to datetime with the to_datetime() method.

Syntax:

dataframe.column.dt.tz_localize('zone_name')

where,

  1. dataframe is the input dataframe
  2. column refers to datetime column
  3. zone_name is the timezone, such as Asia/Kolkata or UTC

 

Example: Add timezone to to_datetime() data

import pandas as pd

# Define a dataframe
df = {"Country": ["IND", "CAL", "LON"],
      "date": ["04/11/2022 09:13:55 AM", "05/10/2022 11:31:05 PM", "12/08/2022 08:00:00 AM"]}

df = pd.DataFrame(df)

# Convert to_datetime and add timezone
df['date'] = pd.to_datetime(df['date']).dt.tz_localize('UTC')

# Print dataframe
print(df)

Output:

  Country                      date
0     IND 2022-04-11 09:13:55+00:00
1     CAL 2022-05-10 23:31:05+00:00
2     LON 2022-12-08 08:00:00+00:00
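
tz_localize() only labels the existing clock time with a zone. To express the same instant in a different zone, tz_convert() can be applied afterwards. A minimal sketch, assuming the input times are meant to be UTC:

```python
import pandas as pd

naive = pd.to_datetime(pd.Series(["2022-04-11 09:13:55"]))

# tz_localize attaches a zone without changing the clock time
utc_times = naive.dt.tz_localize("UTC")

# tz_convert translates the same instant into another zone
india_times = utc_times.dt.tz_convert("Asia/Kolkata")

print(utc_times.iloc[0])    # 2022-04-11 09:13:55+00:00
print(india_times.iloc[0])  # 2022-04-11 14:43:55+05:30
```

Both values refer to the same moment; only the displayed offset differs.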

 

Summary

In this tutorial we covered different examples of the to_datetime() function in Python pandas. We covered the following topics:

  • Convert strings, Series and DataFrame columns to datetime using to_datetime()
  • Modify the output format of to_datetime() using dt.strftime()
  • Print only the date from to_datetime() output (remove the time)
  • Describe the input layout with the format parameter and access the month, day and year fields

 

