Sunday, 25 October 2009

How to do daylight saving time properly

EU switched to standard time this morning, and as always, this causes trouble. Clocks have to be adjusted, computers can get confused in their scheduling, and IT systems failed during the night. My harddisk TV recorder did not have any TV program after 02:59 but had a proper error message for each TV channel. Here is how you can create software that does not have these kinds of problems:

The first thing to realize, is that all timestamps in your software must contain the offset to UTC. If you have a simple TDateTime, it does not contain that, so TDateTime is simply not good enough. Because the Windows API is focused on timestamps that are not UTC compatible, and because the Windows API was never meant to be used with UTC-offset timestamps in the future or in the past, we can look at alternatives. Linux does a very good job at handling all this, so this blog post will explain how to do it the standard Linux way.

First, transport and store all timestamps as "the number of seconds since xyz", where january 1st, 00:00 UTC is a good choice. Also, dismiss the use of leap seconds, so that all hours are 3600 seconds, that makes things much easier. If you need a better resolution than seconds, use milliseconds or floating point numbers, but integers are really handy in many contexts, so keep it in integers, if you can.

Next, realize that day/month/year/hour/minute/second is only necessary when interacting with humans. So do not convert to date and time unless you need to show it to the user, and convert it from date and time as soon as possible after input. As you may note, the conversion requires a ruleset. This is usually specified as a location, like "Europe/Berlin". This way, the conversion knows about historic and planned daylight saving, and other peculiarities about time. For instance, the October revolution in Russia happened in October in Russia, but it was November in Berlin because Germany had switch calendar system, but represented as integer timestamps, that is not a problem. Until modern times, different countries used different calendars, but even in USA, some states operate with many different rulesets, depending on where you live inside the state.

If you want to show a timestamp to the user, you may consider the special case, where there are two date/time combinations that are impossible to differ without extra information. For instance, Europe/Berlin 2009-10-25 02:30. We had two of these today.

Let's take an example: You want to create a chart, where the first axis shows one day, from yesterday until today. You can specify this in multiple ways:

* Start timestamp, End timestamp
* Start timestamp, duration
* End timestamp, duration

You can choose to say "from 2009-10-24 12:00 to 2009-10-25 12:00", or you can say "from 2009-10-24 12:00 and 24 hours onwards". The first choice actually gives 25 hours (!), so you need to make a choice here. If your chart always shows 24 hours, make sure that duration is specified.

Let us assume that we want to create a 24-hour chart. Then, you can simply find the values for the X-axis from the start timestamp, by adding the desired number of seconds. If the start timestamp is the integer number 1256410800, then just add 3600 (1 hour) 24 times, and you have all the timestamps you need. In order to show the chart labels, you need to convert the integers to human readable text. 1256461200 converts to "2009-10-25 02:00" or just "02:00", but the next hour timestamp 1256464800 is also "2009-10-25 02:00". Your entire axis becomes: 12 13 14 15 16 17 18 19 20 21 22 23 00 01 02 02 03 04 05 06 07 08 09 10 11 (2 times "02" and no "12").

The next problem is, how to save this in a database. Integers are obviously easily stored in integer fields. So, is everything solved now? Definitely not. It is absolutely not easy to debug/read integer timestamps in a database using database tools that cannot convert this to something human readable, and many chart tools do not support time-axis with complex conversion to labels like described above. Linux was built for all this, but Windows isn't. When you need to support daylight saving time properly, things get complex on Windows.

Does Windows provide easy, standardized alternative? Unfortunately, not. That's why we're still struggling with a lot of IT systems that do not support daylight saving time well. However, even if you support it well, you still have the problem of users that may not understand how to handle non-existent and duplicate hours.

6 comments:

Stuart King said...

I have to disagree with you. The Windows API does provide an easy and standard way to handle time. It's simple once you understand the two main different formats that the Windows API uses for time:
1) FILETIME
A 64-bit value representing the number of 100-nanosecond intervals since January 1, 1601 (UTC).

2) SYSTEMTIME

typedef struct _SYSTEMTIME {
WORD wYear;
WORD wMonth;
WORD wDayOfWeek;
WORD wDay;
WORD wHour;
WORD wMinute;
WORD wSecond;
WORD wMilliseconds;
} SYSTEMTIME, *PSYSTEMTIME;

I usually use FILETIME to store UTC times, and I usually use SYSTEMTIME when I want to convert to or from text representation of time values.

I find the Windows API much richer than the POSIX functions (I have used both).

Lex Li said...

This has also been a pain for previous .NET applications and now .NET 4 introduces new architecture.

If Embarcadero guys can make good use of the relevant Windows API, Delphi and C++Builder developers can also enjoy the new features, right?

Lex Li said...

forgot the link, http://msdn.microsoft.com/en-us/netframework/dd890508.aspx

Lars D said...

@Stuart: Windows does not have enough time zones to support historic dates, so no matter how many UTC conversion functions you find in the Windows API, it simply cannot handle historic timestamps.

Also, many components in Windows default to using the ymdhms method to describe time information. For instance, the default time picker in windows does not support daylight saving time in its API, and therefore you cannot use it to pick any timestamp.

LDS said...

Some databases support timestamp fields with timezone, thereby that's not an issue as long as you can properly convert to/from, and you can still store them as timestamp data. Using a custom encoding will just make more difficult for other applications to access those data.
Anyway, to handle historic data the timestamp itself must record which timezone it was generate from. And another issue is that sometimes daylight saving time begin/end dates have been changed and could be again. To handle historic data an historic table of daylisght saving times for each country has to be kept updated. IIRC NIST has one.

Lars D said...

@LDS: You cannot save historical timestamps by country. Borders move, and even inside the same country, time rules don't stay the same. The best solution that I have seen, is to specify a city name, like America/Indiana/Marengo or America/Indiana/Knox, together with the timestamp. Or, use a timesystem-independent notation.