Converting UNIX timestamp to readable datetime format in Pandas
A few days ago, we had to deal with some Pandas datetime data.
Looking at the two columns called “start_time_ms” and “end_time_ms”, you wonder how we can interpret them. I then realized that the time format in the Pandas DataFrame is UNIX epoch timestamp in milliseconds.
Our goal is to make the timestamps be more human-readable. There are many ways to convert this. I first tried out this method, the output seemed good and anyone can easily understand the start time & end time for each row.
A few minutes later, the lead engineer questioned me, “Are these time conversions right?” After doing several checking, we found out that the timestamp values didn’t match with the actual time when the data came in.
We figured out that we had to convert the UNIX epoch time to datetime with the specific timezone when the data was generated, as shown in this link by using tz_localize()
and tz_convert()
To truncate the timestamps to seconds and remove the timezone information, you can use pd.DateTimeIndex.floor
and tz_localize(None)
as shown here