The other day, we ran into interesting little problem. We were using MySQL's GROUP_CONCAT() function to turn an aggregate list into a comma-delimited list. But things weren't working. No errors - just confusing behavior. After logging a lot of output, we realized that the GROUP_CONCAT() function was truncating its output without raising an exception.
To see what I mean, take a look at this demo. Here, we're going to create a fake transaction log that associates a transaction ID with a day of the week (bear with me on this silly example). Then, we're going to report on those transactions by grouping the data by the day of the week and collapsing the relevant transaction IDs into a single list.
As you can see from the INSERT statements, each day of the week should have 1,000 transaction IDs associated with it. And yet, when we run the above code and output the report, we get the following:
Day Of Week (1): 147
Day Of Week (2): 147
Day Of Week (3): 147
Day Of Week (4): 147
Day Of Week (5): 147
Day Of Week (6): 147
Day Of Week (7): 147
Day Of Week (2): 147
Day Of Week (3): 147
Day Of Week (4): 147
Day Of Week (5): 147
Day Of Week (6): 147
Day Of Week (7): 147
Here, each day of the week is reporting only 147 transactions, not the full 1,000 that we inserted. The problem is that GROUP_CONCAT() has a native default limit of 1,024 characters (which is further constrained by the maximum size of the packet that MySQL can return). Ultimately, though, the real problem is not that GROUP_CONCAT() has a limit, it's that it failed silently. Instead of throwing a SQL error, it just truncated the value. And, as you can imagine, if the truncation happens mid-transactionID, we're not just dealing with missing data, we're now dealing with false data.
It's easy enough to refactor the algorithms so that they don't use GROUP_CONCAT(). But, I figured I'd share this since we were banging our heads against the wall for a while trying to figure out what was going on. In the long run, I'm not sure that I can think of a great use-case for GROUP_CONCAT(); and these limitations definitely strengthen that perception.
0 comments:
Post a Comment