Tuesday, 7 August 2018

Why NULL Values Should Not Be Used in a Database Unless Required

When you set up a database (at least in MS SQL Server) you can flag a field as allowing NULL values and which default values to take. If you look through people's DB structures, you'll see that a lot of people allow NULL values in their database. This is a very bad idea. I would recommend never allowing NULL values unless the field can logically have a NULL value (and even this I find this only really happens in DATE/TIME fields).
NULL values cause several problems. For starters, NULL values are not the same as data values. A NULL value is basically an undefined values. On the ColdFusion end, this is not terrible as NULL values come across as empty strings (for the most part). But in SQL, NULL and empty string are very different and act very differently. Take the following data table for example:
id    name
---------------
1     Ben
2     Jim
3     Simon
4     <NULL>
5     <NULL>
6     Ye
7
8
9     Dave
10

This table has some empty strings (id: 7, 8, 10) and some NULL values (id: 4, 5). To see how these behave differently, look at the following query where we are trying to find the number of fields that do not have values:
SELECT
(
SELECT
COUNT( * )
FROM
test t
WHERE
LEN( t.name ) = 0
) AS len_count,
(
SELECT
COUNT( * )
FROM
test t
WHERE
t.name IS NULL
) AS null_count,
(
SELECT
COUNT( * )
FROM
test t
WHERE
t.name NOT LIKE '_%'
) AS like_count,
(
SELECT
COUNT( * )
FROM
test t
WHERE
t.name IS NULL
OR
t.name NOT LIKE '_%'
) AS combo_count

This returns the following record:
LEN Count: 3
NULL Count: 2
LIKE Count: 3
Combo Count: 5
We were looking for 5 as records 4, 5, 7, 8, and 10 do not have values in them. However, you can see that only one attempt returned 5. This is because while a NULL value does NOT have a length, it is not a data type that makes sense with length. How can nothing have or not have a length? It's like asking "What does that math equation smell like?" You can't make comparisons like that.
So, allowing NULL values makes you work extra hard to get the kind of data you are looking for. From a related angle, allowing NULL values reduces your convictions about the data in your database. You can never quite be sure if a value exists or not. Does that make you feel safe and comfortable when programming?
Furthermore, while running LEN() on a NULL value doesn't act as you might think it to, it also does NOT throw an error. This will make debugging your code even harder if you do not understand the difference between NULL values and data values.
Bottom line: DO NOT ALLOW NULL VALUES unless absolutely necessary. You will only be making things harder for yourself

0 comments:

Post a Comment