Monday 9 December 2019

MySQL and timezones

The background
We’re dealing with dates and times, and would love to store them against other records in our MySQL database, and use them in our application in PHP. It would also be nice if we could query by them in a fast and efficient manner.
To set up some more things, there are five time zones you need to deal with at the same time, and three of them are settings on your server:
  1. your ACTUAL time zone. I live in Oxford, so my timezone right now is BST, or UTC+1
  2. your server’s timezone. I have a Digital Ocean box in San Francisco. That’s pacific time, so UTC-7 at the moment.
  3. your mysql’s timezone on the server. That may or may not be the same as the system’s. Usually it is.
  4. your mysql connection’s timezone. Because you can dynamically set it per connection. Aren’t timezones neat?
  5. your php’s timezone. Again, it is usually your system’s, but it might not be.
With that, it would be nice to store dates in a way that we could get back the stored value half a year / year later, after a move to the other side of the world, and it meant the same thing. But ... what should we store them in?

TIMESTAMP

Okay, the most straightforward way to store a date and time is in a TIMESTAMP data type. It has the format of YYYY-MM-DD HH:MM:SS, and has a few neat tricks up its sleeve.
It does date validation! Meaning you can’t store 30th February into it, it’ll store 0000-00-00 00:00:00 instead.
Internally it stores the date as UTC. Which means that the actual value stored will largely depend on your current time_zone setting. If I’m trying to store 2016-06-01 23:52:17 right now, it will store it as 2016-06-01 22:52:17, because that’s the time in UTC. If I were to set the time_zone to something else, like SET time_zone = '+8:00', then saving the same 2016-06-01 23:52:17 date will result in an internal datetime of 2016-06-01 15:52:17.
Because of the above, it follows your timezone changes. Look at this:
mysql> select * from datetypes;
Empty set (0.00 sec)

mysql> insert into datetypes (ts, dt) values (now(), now());
Query OK, 1 row affected (0.00 sec)

mysql> select * from datetypes;
+----+---------------------+---------------------+
| id | ts                  | dt                  |
+----+---------------------+---------------------+
|  1 | 2016-06-01 23:55:29 | 2016-06-01 23:55:29 |
+----+---------------------+---------------------+
1 row in set (0.00 sec)

mysql> set time_zone = '+8:00';
Query OK, 0 rows affected (0.00 sec)

mysql> select * from datetypes;
+----+---------------------+---------------------+
| id | ts                  | dt                  |
+----+---------------------+---------------------+
|  1 | 2016-06-02 06:55:29 | 2016-06-01 23:55:29 |
+----+---------------------+---------------------+
1 row in set (0.00 sec)

mysql> insert into datetypes (ts, dt) values (now(), now());
Query OK, 1 row affected (0.00 sec)

mysql> set time_zone = '+1:00';
Query OK, 0 rows affected (0.00 sec)

mysql> select * from datetypes;
+----+---------------------+---------------------+
| id | ts                  | dt                  |
+----+---------------------+---------------------+
|  1 | 2016-06-01 23:55:29 | 2016-06-01 23:55:29 |
|  2 | 2016-06-01 23:59:22 | 2016-06-02 06:59:22 |
+----+---------------------+---------------------+
2 rows in set (0.00 sec)
I have a table. Column ts is a TIMESTAMP type, dt is DATETIME type. I start with being in UTC+1, I insert a record, change to UTC+8, insert another record (the current time both times), change back to UTC+1, and look at what’s happened.
As you can see even though I inserted the current time (23:59) in two vastly different timezones, the TIMESTAMP value ends up being the same because, well, UTC.
The implication however is that if you insert a certain value in timezone 1 and then change timezones, and you expect the same value to come out in timezone 2, it won’t.
We have an assumption that the dates stored are always in UTC, so from this regard TIMESTAMP is absolutely rubbish.
On top of that, because it’s storing it as UTC in the background as a unix timestamp, and THAT has a limitation because it’s a signed 32 bit number, the minimum date is 1970-01-01 00:00:00 and the maximum is 2038-01-19 03:14:07 UTC (that’s 2^31-1), so you can’t really store things like birthdays of mortgage dates (to keep the examples diverse ;) ).
Also as a related function, FROM_UNIXTIME is also dependent on the current time zone. See this:
mysql> set time_zone = '+0:00';
Query OK, 0 rows affected (0.00 sec)

mysql> select from_unixtime(2147483647);
+---------------------------+
| from_unixtime(2147483647) |
+---------------------------+
| 2038-01-19 03:14:07       |
+---------------------------+
1 row in set (0.00 sec)

mysql> set time_zone = '+8:00';
Query OK, 0 rows affected (0.00 sec)

mysql> select from_unixtime(2147483647);
+---------------------------+
| from_unixtime(2147483647) |
+---------------------------+
| 2038-01-19 11:14:07       |
+---------------------------+
1 row in set (0.00 sec)
TIMESTAMP is great if you want to store “points in time”, for example things like updated_at. That happens at one point, and if you move time zones, the point stays the same because the value will also change. The drawback however is that if you assume that the updated_at is also always in UTC. In that case you’re out of luck again.
And then you also can’t really trust that your site’s timezone is going to be the same as PHP’s timezone, or MySQL’s timezone. All is fun.
One more super fun thing. Say you have a DATETIME field, and you want to convert that into a TIMESTAMP field. The current value of DATETIME will be used as the local timezone variant when determining the internal value of the new TIMESTAMP field as described above.
More important than that though, is that this:
ALTER TABLE table_name MODIFY datetime_column TIMESTAMP
implies this:
ALTER TABLE table_name MODIFY datetime_column TIMESTAMP DEFAULT CURRENT_TIME ON UPDATE CURRENT_TIME
but only if the new column would be the very first TIMESTAMP column in that table, because tables can only have one and only one DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP field (unless you’re running a very up to date MySQL instance, in which case you might be able to achieve this with more than one fields...).
Aren’t databases fantastic? So if you want to avoid that, you need to explicitly tell it not to do that:
ALTER TABLE table_name MODIFY datetime_column TIMESTAMP DEFAULT 0

DATETIME

It looks a lot like TIMESTAMP, but is very different. First of all, it has the same structure: YYYY-MM-DD HH:mm:ss, but then the documentation says this:
The DATETIME type is used for values that contain both date and time parts. MySQL retrieves and displays DATETIME values in 'YYYY-MM-DD HH:MM:SS' format. The supported range is '1000-01-01 00:00:00' to '9999-12-31 23:59:59'.
If you pass in 0 as DATETIME, you will get this:
MySQL [(none)]> select cast(0 as datetime);
+---------------------+
| cast(0 as datetime) |
+---------------------+
| 0000-00-00 00:00:00 |
+---------------------+
1 row in set (0.00 sec)
That's because 0 is treated as false. Null becomes null, and any int becomes null due to them being not valid datetime values.
Interestingly enough even though the documentation says the minimum should be the year 1000, I have successfully cast the year 800 onto it:
MySQL [(none)]> select cast( '0800-02-05 11:11:11' as datetime );
+-------------------------------------------+
| cast( '0800-02-05 11:11:11' as datetime ) |
+-------------------------------------------+
| 0800-02-05 11:11:11                       |
+-------------------------------------------+
1 row in set (0.00 sec)
This type, as opposed to TIMESTAMP, does not hold timezone information, so what you put in is what you’re going to get out, regardless of timezones. See the example about timezone changes previously.
The good thing here is that if your code assumes that all time information stored in the database is in UTC, this is perfect. That however means you need to make sure your code passes in UTC timezoned string, and when it reads back it will turn it back into whatever normal timezone the site / application has.

What about INTEGER?

If you store a datetime as YYYY-MM-DD HH:mm:ss, you might as well do it in a unix timestamp. The good thing about it is that it’s so much faster to query by than a piece of string. The bad news is that the data stored in the database will be less human-friendly. When you look at something like 2016-06-06 22:21:49, you know immediately what it means, but if you look at 1465251709, you have no idea. Not even a mild inkling of where that might be in relation to "now". The two are the same, by the way.
On the other hand, I argue that once you’re at the point where you need performance where this matters, then the database stops being something to look at for humans. Go and build pretty interfaces for it, and make sure your code can handle converting between formats and timezones.

How to tell what my timezone offset of the database is?

It’s surprisingly hard to find that information out. Involves reading settings which you may or may not have access to:
MySQL [(none)]> SELECT @@global.time_zone, @@session.time_zone;
+--------------------+---------------------+
| @@global.time_zone | @@session.time_zone |
+--------------------+---------------------+
| SYSTEM             | SYSTEM              |
+--------------------+---------------------+
1 row in set (0.00 sec)
Well, gee, thanks a bunch... -.- I have no idea what SYSTEM is unless I have access to the server itself.
Do this instead:
MySQL [(none)]> select timediff( now(), utc_timestamp() );
+------------------------------------+
| timediff( now(), utc_timestamp() ) |
+------------------------------------+
| 01:00:00                           |
+------------------------------------+
1 row in set (0.00 sec)
It means my server is in UTC+1 at the moment.
Hope I shed some light on time stuff. To recap:

What should I store my dates in?

  1. Can your application supply / read / assume that all dates are UTC, and you want performance? -> INTEGER as a unix timestamp.
  2. Can your application supply / read / assume that all dates are UTC, or you need to store dates older than 1970 or after 2038, but want them to be human readable? => DATETIME
  3. Do you need a fixed point in time regardless of timezones, and will your code assume the value you get is in the local timezone of the MySQL server? => TIMESTAMP

0 comments:

Post a Comment