Friday 9 November 2018

MySQL: Is a 'blackhole' table evil?

Reading this question I've just learned of the blackhole table trick: basically, it consists in using a single table as the insert target, with a trigger that splits the data out into many other tables.
I'm wondering whether this could cause problems, assuming the developers working on the project are aware of it.
What are the pros and cons of this technique?
Edit: The thought I had in mind when I saw the example is about transactions: if for some reason the transaction fails, you'll find the blackhole row with the original data, for historical purposes and maybe as a debugging aid - but this seems to be the only plus I can see with blackholes. Ideas?
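For concreteness, here is a minimal sketch of the trick; every table, column, and trigger name below is invented for illustration. One common variant declares the collector table with MySQL's BLACKHOLE storage engine, which discards the inserted row itself so that only the trigger's inserts persist (a plain InnoDB collector behaves the same way but also keeps the original row, which is what the "historical" point above assumes). Note that MySQL fires only BEFORE triggers on a BLACKHOLE table, which is why the trigger below is a BEFORE INSERT.

    -- Real destination tables (names invented for this sketch).
    CREATE TABLE person (
        name VARCHAR(100)
    ) ENGINE = InnoDB;

    CREATE TABLE contact (
        email VARCHAR(100)
    ) ENGINE = InnoDB;

    -- The single insert target; stores nothing itself.
    CREATE TABLE person_all (
        name  VARCHAR(100),
        email VARCHAR(100)
    ) ENGINE = BLACKHOLE;

    DELIMITER //
    CREATE TRIGGER split_person_all
    BEFORE INSERT ON person_all   -- must be BEFORE: AFTER triggers do not fire on BLACKHOLE
    FOR EACH ROW
    BEGIN
        -- Fan the incoming row out to the real tables.
        INSERT INTO person  (name)  VALUES (NEW.name);
        INSERT INTO contact (email) VALUES (NEW.email);
    END//
    DELIMITER ;

    -- The application issues one insert; the trigger does the rest.
    INSERT INTO person_all (name, email) VALUES ('Ada', 'ada@example.com');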

Answers


I don't think blackhole has any real pros.
Writing the trigger code to move data around is probably not noticeably less work than writing the code to insert the data in the right place in the first place (see the comparison sketch after this answer).
As Christian Oudard writes, it doesn't reduce complexity - it just moves it to a place where it's really hard to debug.
On the downside:
"Side effects" are usually a bad idea in software development. Triggers are side effects - I intend to do one thing (insert data in a table), and it actually does lots of other things. Now, when I'm debugging my code, I have to keep all the side effects in my head too - and the side effects could themselves have side effects.
Most software spends far more time in maintenance than it does in development. Bringing new developers into the team and explaining the black hole trick is likely to increase the learning curve - for negligible benefit (in my view).
Because triggers are side effects, and it's relatively easy to set off a huge cascade of triggers if you're not careful, I've always tried to design my databases without a reliance on triggers; where triggers are clearly the right way to go, I've only let my most experienced developers create them. The black hole trick makes triggers into a normal, regular way of working. This is a personal point of view, of course.
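For comparison, the trigger-free version of the same write, reusing the invented tables from the sketch above, is hardly more code:

    -- Inserting the data in the right place in the first place:
    -- two plain inserts in one transaction, no trigger involved.
    START TRANSACTION;
    INSERT INTO person  (name)  VALUES ('Ada');
    INSERT INTO contact (email) VALUES ('ada@example.com');
    COMMIT;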



This doesn't look like a good idea. If you're trying to keep the front-end code simple, why not just use a stored procedure? If it's not to keep the front-end code simple, I don't understand the point at all.
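A stored procedure along these lines keeps the application down to a single call without hiding the fan-out behind a trigger side effect (a hypothetical sketch, reusing the invented tables from the question's example):

    DELIMITER //
    CREATE PROCEDURE insert_person(IN p_name VARCHAR(100),
                                   IN p_email VARCHAR(100))
    BEGIN
        -- One explicit, debuggable unit of work instead of a trigger.
        START TRANSACTION;
        INSERT INTO person  (name)  VALUES (p_name);
        INSERT INTO contact (email) VALUES (p_email);
        COMMIT;
    END//
    DELIMITER ;

    -- The front end makes one call, just as with the blackhole table:
    CALL insert_person('Ada', 'ada@example.com');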



Each time you insert a row into a table, the odds are that you are writing to the same area of the hard drive or the same page (in the MS SQL world; I don't know about PostgreSQL), so this technique will likely lead to contention and locking, as all transactions are now competing to write to the same table.
Also, this will halve insert performance, since each logical insert now requires two physical inserts instead of one.
And this is denormalization, since there are now two copies of the data instead of one.



Don't do this. The fact that it's called a trick and not a standard way of doing something says enough for me.
This totally kills the normal usage pattern of the relational model. I'm not sure it actually breaks normal form, as you can still have all of that in place; it's just messing with the way data makes it to the destination tables. It looks like a performance nightmare on top of a maintenance nightmare. Imagine one table having a trigger that has to fire for 1,800-plus table inserts, for example. That just makes me feel sick.
This is an interesting parlor trick, nothing more.
