Monday, 12 November 2018

MySQL load NULL values from CSV data

I have a file that can contain from 3 to 5 columns of numerical values separated 
by commas. Empty fields are always delimited, except when they fall at 
the end of the row:
1,2,3,4,5
1,2,3,,5
1,2,3
The following table was created in MySQL:
+-------+--------+------+-----+---------+-------+
| Field | Type   | Null | Key | Default | Extra |
+-------+--------+------+-----+---------+-------+
| one   | int(1) | YES  |     | NULL    |       | 
| two   | int(1) | YES  |     | NULL    |       | 
| three | int(1) | YES  |     | NULL    |       | 
| four  | int(1) | YES  |     | NULL    |       | 
| five  | int(1) | YES  |     | NULL    |       | 
+-------+--------+------+-----+---------+-------+
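For anyone who wants to reproduce this, the table above can probably be recreated with something like the following. This is a sketch reconstructed from the DESCRIBE output; the table name moo is taken from the LOAD DATA statement, and the storage engine and any other table options are left at the server defaults because they are not shown in the question:

```sql
-- Reconstructed from the DESCRIBE output; every column is a nullable
-- integer whose default is NULL. Engine and charset are not specified
-- in the question, so the server defaults are assumed.
CREATE TABLE moo (
    one   INT(1) DEFAULT NULL,
    two   INT(1) DEFAULT NULL,
    three INT(1) DEFAULT NULL,
    four  INT(1) DEFAULT NULL,
    five  INT(1) DEFAULT NULL
);
```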
I am trying to load the data using the MySQL LOAD DATA command:
LOAD DATA INFILE '/tmp/testdata.txt' INTO TABLE moo FIELDS 
TERMINATED BY "," LINES TERMINATED BY "\n";
The resulting table:
+------+------+-------+------+------+
| one  | two  | three | four | five |
+------+------+-------+------+------+
|    1 |    2 |     3 |    4 |    5 | 
|    1 |    2 |     3 |    0 |    5 | 
|    1 |    2 |     3 | NULL | NULL | 
+------+------+-------+------+------+
The problem is that when a field is present but empty in the raw data, MySQL does not 
use the column's default value (which is NULL) and stores zero instead. NULL is used 
correctly when the field is missing altogether.
Unfortunately, I have to be able to distinguish between NULL and 0 at this stage, so any 
help would be appreciated.
Thanks S.
Edit: the output of SHOW WARNINGS:
+---------+------+--------------------------------------------------------+
| Level   | Code | Message                                                |
+---------+------+--------------------------------------------------------+
| Warning | 1366 | Incorrect integer value: '' for column 'four' at row 2 | 
| Warning | 1261 | Row 3 doesn't contain data for all columns             | 
| Warning | 1261 | Row 3 doesn't contain data for all columns             | 
+---------+------+--------------------------------------------------------+ 

Answers


This will do what you want. It reads the fourth field into a user variable, and then 
sets the actual column value to NULL if the variable ends up containing an empty string:
LOAD DATA INFILE '/tmp/testdata.txt'
INTO TABLE moo
FIELDS TERMINATED BY ","
LINES TERMINATED BY "\n"
(one, two, three, @vfour, five)
SET four = NULLIF(@vfour,'')
;
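To see why NULLIF keeps a genuine zero distinct from an empty field, a quick check at the mysql prompt may help. This is only an illustrative sketch; the column aliases are made up for the example:

```sql
-- NULLIF(a, b) returns NULL when a = b, otherwise it returns a.
-- Only the empty string is turned into NULL; '0' and '4' pass through.
SELECT
    NULLIF('',  '') AS empty_field,   -- NULL
    NULLIF('0', '') AS zero_field,    -- '0'
    NULLIF('4', '') AS filled_field;  -- '4'
```

Because the comparison is made against the raw text of the field, a literal 0 in the file is still loaded as 0 rather than NULL.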
If any of them can be empty, read them all into variables and use a single SET clause 
with one assignment per column, like this:
LOAD DATA INFILE '/tmp/testdata.txt'
INTO TABLE moo
FIELDS TERMINATED BY ","
LINES TERMINATED BY "\n"
(@vone, @vtwo, @vthree, @vfour, @vfive)
SET
one = NULLIF(@vone,''),
two = NULLIF(@vtwo,''),
three = NULLIF(@vthree,''),
four = NULLIF(@vfour,''),
five = NULLIF(@vfive,'')
;



The behaviour depends on the server configuration. In strict SQL mode, loading an 
empty string into an integer column raises an error; otherwise it produces a warning 
and stores 0. The following query shows the current configuration:
mysql> show variables like 'sql_mode';
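As a rough sketch of how to read that setting, the session value can also be selected directly and tested for the two strict flags. The alias strict_mode_on is just a name chosen for this example:

```sql
-- Show the SQL mode that applies to the current session.
SELECT @@SESSION.sql_mode;

-- sql_mode is a comma-separated list, so FIND_IN_SET can look for the
-- strict flags. The result is 1 if either flag is present, else 0.
SELECT FIND_IN_SET('STRICT_TRANS_TABLES', @@SESSION.sql_mode) > 0
    OR FIND_IN_SET('STRICT_ALL_TABLES',   @@SESSION.sql_mode) > 0
       AS strict_mode_on;
```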
